Table of Contents
1. Problem Definition and Literature Review
1.1 Introduction
1.2 Background
1.3 Literature Review
1.3.1 Introduction
1.3.2 Literature Paper Review
1.3.2.1 Answering Imprecise Database Queries: A Novel Approach
1.3.2.2 Structuring Keyword-Based Queries for Web Databases
1.3.2.3 Testing Web Database Applications
1.3.2.4 Generating Web-Based Systems from Specifications
1.3.2.5 Structured Databases on the Web: Observations and Implications
1.3.2.6 Web Mining Research: A Survey
1.3.2.7 Editorial: Special Issue on Web Content Mining
1.3.2.8 WEBKDD 2002 – Web Mining for Usage Patterns & Profiles
1.3.2.9 A Web Personalization System based on Web Usage Mining Techniques
1.3.2.10 Assessing the Quality of Auction Web Sites
1.3.2.11 Designing Multinational Online Stores: Challenges, Implementation Techniques and Experience
1.3.2.12 Customer-centered Rules for Design of E-commerce Web sites
1.3.2.13 A Comparative Usability Evaluation of User Interfaces for Online Product Catalog
1.3.2.14 Effects of Scent and Breadth on Use of Site-specific Search on E-commerce Web sites
1.3.2.15 The Dynamics of Mass Online Marketplaces: A Case Study of An Online Auction
1.3.3 Literature Survey Conclusion
1.4 Goals and Objectives
1.5 Overall Approach
2. Requirements Analysis
2.1 Introduction
2.2 Overall Description
2.3 System Requirements And Constraints
2.3.1 Operating Environment (External Constraints)
2.3.2 Market Users and Characteristics
2.3.3 Environmental Constraints
2.3.4 System Components
2.3.5 Software Interface and Library
2.3.6 System Maintenance
2.4 Performance Requirements
2.5 Resource Requirements
2.6 Alternative Solution
2.7 Evaluation Metrics
3. Design Specifications
3.1 Introduction
3.2 System Design Overview
3.3 Data Requirements
3.4 Software Design
3.5 Testing Methods
3.6 Scheduling Diagrams with Task Assignments
3.7 Design Implementation Costs
4. System Implementation
4.1 Introduction & System Implement Overview
4.2 System Database Implementation
4.3 System Basic Interface
4.4 System User Login / logout & Register Function
4.4.1 User Register Function
4.4.2 User Login / logout Function
4.5 Sale Function
4.5.1 Book Function
4.5.2 Car Function
4.6 News Function
5. System Performance, Testing & Evaluation
5.1 Introduction & System Overview
5.2 System Performance
5.3 System Testing
5.4 System Evaluation
6. Conclusions & Discussion
6.1 Introduction & Overview
6.2 Conclusions
6.2.1 Wang Yi’s Conclusion
6.2.2 Wen Lei’s Conclusion
6.2.3 Carol Lim’s Conclusion
6.3 Discussion
7. Future Works
7.1 Future Works
8. References
References
[01] A. A. Elbibas, M. J. Ridley. “Using Metadata for Developing Automated Web System
Interface”. 1st international symposium on Information and communication
technologies, pp. 113-118, September 2003.
[02] BIGSF. “Government Web Application Integrity”. The Business Internet Group of
San Francisco, 2003.
[03] U. Nambiar, S. Kambhampati. “Answering Imprecise database Queries: A Novel
Approach”. ACM International Workshop on Web Information and Data
Management, pp. 126-133, 2003.
[04] R. C. Vieira et al. “Structuring Keyword-Based Queries for Web Databases”. 2nd
ACM/IEEE-CS joint conference on Digital libraries, pp. 94-95, 2002.
[05] Y. Deng, P. Frankl, J. Wang. “Testing Web Database Applications”, ACM SIGSOFT
Software Engineering Notes, Vol. 29, Issue 5, pp. 1-10, September 2004.
[06] T. B. Jensen, T. K. Tolstrup, M. R. Hansen. “Generating Web-Based Systems from
Specifications”, ACM symposium on Applied computing, pp. 1647-1653, 2004.
[07] K. Chang et al. “Structured Databases on the Web: Observations and Implications”,
ACM SIGMOD Record, Vol. 33, Issue 3, pp. 61-70, September 2004.
[08] R. Kosala, H. Blockeel. “Web Mining Research: A Survey”, SIGKDD Explorations,
Vol. 2, Issue 1, pp. 1-15, June 2000.
[09] B. Liu, K. C.-C. Chang. “Editorial: Special Issue on Web Content Mining”,
SIGKDD Explorations, Vol. 6, Issue 2, pp. 1-4, December 2004.
[10] B. M. Masand, M. Spiliopoulou, J. Srivastava, O. R. Zaiane. “WEBKDD 2002 – Web
Mining for Usage Patterns & Profiles”, SIGKDD Explorations, Vol. 4, Issue 2, pp.
125-127, December 2002.
[11] M. Albanese, A. Picariello, C. Sansone, L. Sansone. “A Web Personalization System
based on Web Usage Mining Techniques”, WWW2004, New York, USA, May 17-22, 2004.
[12] S. J. Barnes, R. T. Vidgen. “Assessing the Quality of Auction Web Sites”, Proceedings
of the 34th Annual Hawaii International Conference on System Science, IEEE,
2001.
[13] S. Cherry, “eBuyer, Beware”. IEEE Spectrum, October 2004.
[14] Y. Chan, H. Suwanda, “Designing multinational online stores: challenges,
implementation techniques and experience”, Conference of the Centre for
Advanced Studies on Collaborative research, November 2000.
[15] X. Fang, G. Salvendy, “Customer-centered rules for design of e-commerce Web
sites”, Communications of the ACM, Vol. 46 Issue 12, December 2003.
[16] E. Callahan, J. Koenemann, “A comparative usability evaluation of user interfaces
for online product catalog”, 2nd ACM conference on Electronic commerce,
October 2000.
[17] M. A. Katz, M. D. Byrne, “Effects of scent and breadth on use of site-specific search
on e-commerce Web sites”, ACM Transactions on Computer-Human Interaction
(TOCHI), Vol. 10 Issue 3, September 2003.
[18] J. Hahn, “The dynamics of mass online marketplaces: a case study of an online
auction”, ACM Press New York, NY, USA, pp 317 – 324, March 2001.
[19] Database Create Table List
1.1 Introduction
Driven by rapid and significant business innovation, the e-commerce market has fueled
Internet business. Enterprise Web sites such as www.ebay.com and www.amazon.com bring
consumers to their businesses online. Online shopping is becoming more popular, and more
people prefer to purchase products online rather than go to stores. Users can buy or sell
items worldwide over the Internet. However, big items such as cars or furniture incur
high shipping and handling costs.
Currently, most people in a local area who want to sell their goods without paying
advertisement fees post their information on a bulletin board. For example, if a student
wants to sell her used book, she will print numerous hard copies that contain relevant
information such as the title of the book, the price, and contact information. Then she
will post these copies on as many local bulletin boards as she can. However, this
method has three disadvantages. First, it is very inconvenient because the seller has to
post copies in many places. Second, it has limited reach because only people who live
near the bulletin board or pass by it will look at it, and only if they are interested.
This means that many people may never read the posted information. Third, the posted
information is not “safe”. For example, the copies may be torn down by strong wind or by
somebody, or they may be covered by newer copies posted by other users.
We are going to design a web-based database system, the MU Bulletin Board. It will
focus on the local market. The underlying idea of this project is to give users an
alternative way to buy or sell their unused items. In addition, our system provides
information and announcements regarding campus or user activities. The ultimate goal
of our project is to give users the opportunity to sell their items and to get
information about their local activities. The MU Bulletin Board will be able to offer
users alternative choices of items. Eventually, functionality will be provided to match
possible items to users’ requirements.
1.2 Background
Since our project focuses on the local region and university students, we
interviewed Kathy, the director of the MU student association (MSA), and John, the MSA
programmer, about the market demand for this kind of system and about current student
behavior. Coincidentally, they were working on an off-campus life project, which
is similar to our renting function (http://offcampus.missouri.edu; see Figure 1.1, an
interface sample). We did some research on that Web site and on other MSA Web sites such
as the roommate search Web site (http://roommates.missouri.edu; see Figure 1.2,
an interface sample). We found that each MSA Web site focuses on a single idea or
function. A student who wanted to sell a used book and rent an apartment needed to
know about and search different Web sites. We do not think that this is convenient for
students. As students, we prefer to have one Web site that contains “everything” that we
are interested in. Kathy and John also mentioned that the MU Bookstore had an “eBay-like”
Web site, called TigerBay, for trading used books online. The system was shut down due to
an insufficient number of users. According to Kathy, many students did not know about the
Web site because the bookstore did not publicize it. In her opinion, the bookstore did not
advertise the Web site to the public because it might have affected the bookstore’s business.
As stated before, we will design an online bulletin board system for local people,
called the MU Bulletin Board. Our system will have an advertisement function (similar to
what we have discussed) and many other functions such as a trade function and a share
function. Users can broadcast their trade information to everyone in one place. Anyone
with a computer connected to the Internet can access this Web site for free. For
example, the secretary of the MU Department of Computer Science can use our system to
post a message to students and faculty instead of using email.
Figure 1.1 Off CAMPUS
Figure 1.2 Roommates
1.3 Literature Survey
1.3.1 Introduction
Web-based systems have grown tremendously and have changed our
lives dramatically in the past few years. In the early stage, users relied on static Web
pages, stored in the file system of a running Web server, to post and retrieve
information through a hypertext interface. In this new era, developers use databases to
support dynamic Web pages, since the most attractive features of current Web sites, such
as online shopping and stock trading, rely on updates and queries, according to [1].
As commercial, scientific, and social activities depend increasingly on Web database
applications, the developed systems have become more
complex than before. This situation leads to many problems, such as Web application
failures and unsatisfactory results for users’ queries. Moreover, data mining, or Web
mining, is important for obtaining accurate information quickly. We need to control the
quality of our project to maintain good service quality for our users. Our paper looks at
recent research strategies and at different models for multinational online stores that
are applied to these problems.
1.3.2 Literature Paper Review
1.3.2.1 Answering Imprecise Database Queries: A Novel Approach
The rapid growth of Internet usage has changed its users from highly
trained professionals to untrained regular people, so most web-based database systems on
the Internet provide a form-based interface for users to interact with the database. The
form-based interface is easy to use because users need not write complicated
queries, but it imposes strict constraints on the attribute values stored in the
database, which often leave users with unsatisfactory results. This problem can be solved
by providing ranked similar results if exact answers cannot be found. In this paper, the
authors provide a domain-independent solution, IQE (Imprecise Query Engine), that can
answer imprecise queries over a database without changing the existing database.
According to the authors, an imprecise query is a user query that “requires data
closely matching the query constraint” [3]; a precise query is a user query that “requires
data exactly matching the query constraint”. A relational database exposes a form-based
interface on the Internet for querying the data stored in the database.
For any user query, the answers must exactly match the query constraint; otherwise an
empty set is returned, so all user queries are treated as precise queries. In order to
support imprecise queries over the database, the authors add IQE between the users and
the database [3] (see Figure 1.3).
Figure 1.3 Imprecise Query Engine
According to [3], the authors use an existing Web database that only accepts
precise queries over a form-based interface. They add a SimQuery Engine and a Similarity
Estimator to this architecture. The SimQuery Engine converts the imprecise query
into an equivalent precise query, and the Similarity Estimator calculates the similarity
between each pair of queries. For example, after receiving a user’s imprecise query, IQE
identifies and returns a set of precise queries that are relevant to the given
query. The results of testing the architecture indicate that IQE is able to answer
imprecise queries with high levels of user satisfaction: in the experiments,
65% of the results were related to the users’ imprecise queries.
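The idea of returning ranked similar results when no exact answer exists can be illustrated with a minimal sketch. It is not the IQE algorithm of [3]: the sample relation, the function names, and the similarity measure (the fraction of constraints a row satisfies) are our own assumptions, chosen only to show the fallback from a precise query to ranked approximate answers.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]

# Hypothetical sample relation; the rows are illustrative only.
CARS: List[Row] = [
    {"make": "Toyota", "model": "Camry", "year": "2000"},
    {"make": "Toyota", "model": "Corolla", "year": "2000"},
    {"make": "Honda", "model": "Accord", "year": "2001"},
]


def precise_query(rows: List[Row], constraints: Row) -> List[Row]:
    """Return only the rows that match every constraint exactly."""
    return [r for r in rows if all(r.get(a) == v for a, v in constraints.items())]


def similarity(row: Row, constraints: Row) -> float:
    """Fraction of constraints the row satisfies (an assumed stand-in measure)."""
    matched = sum(1 for a, v in constraints.items() if row.get(a) == v)
    return matched / len(constraints)


def imprecise_query(rows: List[Row], constraints: Row, k: int = 3) -> List[Tuple[float, Row]]:
    """Return exact answers if any exist; otherwise the k most similar rows, best first."""
    exact = precise_query(rows, constraints)
    if exact:
        return [(1.0, r) for r in exact]
    scored = [(similarity(r, constraints), r) for r in rows]
    scored = [(s, r) for s, r in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

For example, a query for a 2001 Toyota Camry has no exact match in the sample relation, so the sketch falls back to the rows that satisfy the most constraints and ranks the 2000 Camry first.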
We think this is a very useful approach because it can be
implemented without affecting the existing database. In addition, it can be applied
to many existing databases easily. However, this approach has two disadvantages. First,
it is very time consuming for a large database (one that involves thousands of
precise queries). Second, system owners will spend more money on their system if they
adopt this approach.
1.3.2.2 Structuring Keyword-Based Queries for Web Databases
Most online information systems that provide access to databases through the Web use
customized interfaces such as forms, navigation menus, and other browsing mechanisms.
According to the authors, customized interfaces have two important disadvantages.
First, from the Web users’ point of view, they may be too complicated for complex
databases. For example, Web users may need to fill in many fields or forms. Second, from
the system developers’ point of view, customized interfaces may increase the cost of
providing access to several distinct databases. In this paper, the authors propose a
framework that produces structured queries from a set of keywords given by the Web
users and uses information retrieval techniques to rank the answers [4].
According to the authors [4], the proposed framework has four steps.
First, it receives the user’s unstructured query, a set of keywords obtained via an
interface. This can be accomplished with a single search box interface. The
second step builds the structured queries. To accomplish this task, a
local repository of data is required. This local repository should contain attributes and
a set of values for each attribute. The attributes of the local repository are the same
as the attributes of the database, and the values are samples from the corresponding
attribute domains. To find the possible queries, the framework uses a Bayesian network,
according to [4] (see Figure 1.4), to compute the similarity between the individual terms
of a query and the values stored in the local repository. Third, after the structured
queries are ranked, either the highest ranked query is selected or the user chooses one
of the best ranked queries. Finally, the selected query is processed and the returned
results are ranked.
Figure 1.4 A Bayesian network model for two attributes to evaluate structured queries
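The first three steps of the framework can be sketched as follows. This is only an illustration under our own assumptions: the repository contents and function names are hypothetical, and the score (the share of an attribute’s sample values that contain the keyword) is a simple stand-in for the Bayesian network used in [4].

```python
from itertools import product
from typing import Dict, List, Tuple

# Hypothetical local repository: attribute -> sample values from its domain.
REPOSITORY: Dict[str, List[str]] = {
    "title": ["Web Mining", "Database Systems", "Java Programming"],
    "author": ["Liu", "Silberschatz", "Horstmann"],
    "subject": ["web", "web", "database"],
}


def term_scores(term: str) -> Dict[str, float]:
    """Score each attribute by the share of its sample values containing the term."""
    scores = {}
    for attribute, values in REPOSITORY.items():
        hits = sum(1 for v in values if term.lower() in v.lower().split())
        if hits:
            scores[attribute] = hits / len(values)
    return scores


def structured_queries(keywords: List[str]) -> List[Tuple[float, Dict[str, str]]]:
    """Enumerate attribute assignments for the keywords and rank them by score."""
    per_term = [list(term_scores(k).items()) for k in keywords]
    if any(not options for options in per_term):
        return []  # some keyword matches nothing in the repository
    ranked = []
    for combo in product(*per_term):
        attributes = [attribute for attribute, _ in combo]
        if len(set(attributes)) < len(attributes):
            continue  # this sketch allows only one keyword per attribute
        score = 1.0
        for _, s in combo:
            score *= s
        ranked.append((score, dict(zip(attributes, keywords))))
    ranked.sort(key=lambda pair: pair[0], reverse=True)
    return ranked
```

Calling `structured_queries(["web", "liu"])` yields two candidate queries; the one that treats "web" as a subject ranks above the one that treats it as a title word, because more sample subjects contain that term. The highest ranked candidate, or one chosen by the user, would then be run against the database.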
According to the authors [4], the experimental results of the framework are
good. 63% of the ranked queries were completely correct, and 97% of the correct
queries were among the top three ranked queries. In addition, for the queries that were
not completely correct, the attributes were assigned correctly in 69% of cases.
We think the authors propose a useful framework since it provides a simple
interface to Web users and system developers. The single search box interface is easy to
use and reduces the cost of interface development and
maintenance.
1.3.2.3 Testing Web Database Applications
More and more services now depend on Web database systems. In order to provide
correct application functions and suitable protection to owners, testing and integrating
the individual components of a web service system is very important. In general, a web
database application system contains three layers. The Database
Management System (DBMS) and the database form the base; the client web browser, which
is used as the interface to the application, is on top; and the application logic, which
is developed in a scripting language and interacts with the DBMS and HTML, lies between
them (see Figure 1.5), according to [5].
Figure 1.5 Typical Web Database Application Configurations
Developers usually pay insufficient attention to the correctness and security of
applications, since application development is often driven by time-to-market.
According to [2], 68% of tested Web sites had Web application failures. Since
Web applications are highly dynamic and interactive, it is very difficult to test them.
According to the authors [5], most prior research on web application testing
provides a “black box” approach that uses crawler technology to find links in the
application and replay values for form inputs. This paper describes a “white box”
approach that analyses the application source in order to get appropriate values and
inputs that are related to the database. This approach involves the following five steps.
First, useful information, such as URL links, is extracted from the application source.
Second, the approach generates and simplifies the application graph. Third, paths that
correspond to one or more test cases are selected. Fourth, the approach uses AGENDA, a
tool set for testing relational database applications, to generate inputs. Together with
its inputs, each path constitutes a test case. Finally, the test cases are executed
automatically and the database state is checked after each update or insertion [5]. The
empirical evaluation demonstrates that the white box approach is robust and efficient for
large application programs.
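The first three steps (link extraction, graph construction, and path selection) can be sketched as follows. This is a simplified illustration, not the tool described in [5]: the page sources are hypothetical, links are found with a regular expression over literal `href` and `action` attributes, so dynamically generated URLs are not handled, and input generation with AGENDA and the database state checks are omitted.

```python
import re
from typing import Dict, List

# Hypothetical application sources: page name -> source text.
SOURCES: Dict[str, str] = {
    "index.php": '<a href="login.php">Login</a> <a href="list.php">Items</a>',
    "login.php": '<form action="list.php" method="post"></form>',
    "list.php": '<a href="item.php">Details</a>',
    "item.php": "<p>No outgoing links</p>",
}

# Matches the target of a literal href="..." or action="..." attribute.
LINK_PATTERN = re.compile(r'(?:href|action)="([^"]+)"')


def build_graph(sources: Dict[str, str]) -> Dict[str, List[str]]:
    """Steps 1 and 2: extract links from each source and build the application graph."""
    return {page: LINK_PATTERN.findall(text) for page, text in sources.items()}


def enumerate_paths(graph: Dict[str, List[str]], start: str) -> List[List[str]]:
    """Step 3: list every acyclic path from the start page until no unvisited link remains."""
    paths: List[List[str]] = []

    def walk(page: str, path: List[str]) -> None:
        next_pages = [p for p in graph.get(page, []) if p not in path]
        if not next_pages:
            paths.append(path)
            return
        for nxt in next_pages:
            walk(nxt, path + [nxt])

    walk(start, [start])
    return paths
```

Each returned path is a candidate test case skeleton; in the approach of [5], a path becomes a test case only once inputs have been generated for it.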
According to the authors, the white box approach can find some paths that may
not be found by the black box approach. For example, the white box approach can find
dynamically generated URL links. Moreover, in the white box approach, the database
state is constructed to include many different situations. However, this approach has one
disadvantage: it must target a particular source language, such as Java.
1.3.2.4 Generating Web-Based Systems from Specifications
According to the authors, many web-based systems are poorly documented and
badly tested because of their complexity, and they contain many software errors. This
paper describes several concepts and tools for building reliable web-based systems. The
main idea of this paper is that the major parts of web applications can
be generated from specifications of functional requirements and users’ interactions. There
are four steps to generate a desirable system. The first step is to specify the
requirements as a set of function specifications and a list of type declarations. Next, a
database design such as an Entity Relationship Diagram (ERD, a graphical tool used to
identify entities, constraints, and the relationships between the entities) is produced.
Following this, the navigation specification is defined; it uses a graph-based
description to model users’ interactions with the system (see Figure 1.6) [6]. The last
step is to generate code.
Figure 1.6 Navigation Diagram
The concepts and tools of this paper are similar to what we have learned in
database development courses. This paper not only refreshes the knowledge that we have
learned but also shows us that good documentation is very important for systems. The
development of our system will be based on the following steps. First, we will write a
well-documented requirements analysis that covers system constraints, performance
requirements, and resource requirements. Second, after identifying the client’s
requirements, we will create the ERD and then convert it into database tables. Third, we
will implement our system using PHP. We may not write a navigation specification
since we are not familiar with it.
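The second step, converting ERD entities into database tables, can be sketched as follows. The entities and columns are hypothetical examples, not our final schema, and the sketch uses Python with an in-memory SQLite database purely for illustration, whereas our system itself will be implemented in PHP.

```python
import sqlite3
from typing import Dict, List, Tuple

# Hypothetical entities taken from an ERD: table -> list of (column, SQL definition).
ENTITIES: Dict[str, List[Tuple[str, str]]] = {
    "users": [("user_id", "INTEGER PRIMARY KEY"), ("name", "TEXT NOT NULL")],
    "books": [
        ("book_id", "INTEGER PRIMARY KEY"),
        ("title", "TEXT NOT NULL"),
        ("price", "REAL"),
        ("seller_id", "INTEGER REFERENCES users(user_id)"),
    ],
}


def create_table_sql(table: str, columns: List[Tuple[str, str]]) -> str:
    """Generate a CREATE TABLE statement from an entity description."""
    body = ", ".join(f"{name} {definition}" for name, definition in columns)
    return f"CREATE TABLE {table} ({body})"


def build_schema(connection: sqlite3.Connection) -> None:
    """Create one table per entity."""
    for table, columns in ENTITIES.items():
        connection.execute(create_table_sql(table, columns))


# Build the schema, insert one seller and one book, and join them.
connection = sqlite3.connect(":memory:")
build_schema(connection)
connection.execute("INSERT INTO users (user_id, name) VALUES (1, 'Alice')")
connection.execute(
    "INSERT INTO books (title, price, seller_id) VALUES (?, ?, ?)",
    ("Database Systems", 25.0, 1),
)
rows = connection.execute(
    "SELECT b.title, u.name FROM books b JOIN users u ON b.seller_id = u.user_id"
).fetchall()
```

The relationship between a book and its seller in the ERD becomes the `seller_id` foreign key column, which is what the final join relies on.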
1.3.2.5 Structured Databases on the Web: Observations and Implications
This paper tries to measure characteristics of the deep Web (potentially unlimited
information hidden behind query interfaces) in order to explore hidden information
(databases that are not linked from HTML pages). It focuses on structured web-based
databases and has two studies: a macro study (it surveys the deep Web at large, such as
how many databases there are and how many are structured databases) and a micro study (it
surveys source characteristics, such as how complex the query interfaces and the
queries are). In the macro study, after surveying the deep Web with one million random IP
samples, the authors found that there were 450,000 Web databases, but that the current
directory services cover only 15.6% of these databases. On the other hand, “the micro
study surveys source-specific characteristics over 441 sources in eight representative
domains”, according to the authors. This means that the micro study is intended to
identify domain-specific implications.
According to the authors, the structured databases, “which provide data objects as
structured relational records with attribute-value pairs”, are easier to find, as compared to
unstructured databases. The deep Web is hidden for some domains because these
domains usually do not support browse interfaces.
This paper gives us an effective way to access the deep Web. It can help users
find the right sources and query them in the right way. Traditional information
integration focuses only on small-scale systems, but this new information integration
will focus on large-scale systems and will be very challenging. To make our
database visible, we are going to design a structured database and use more static data.
1.3.2.6 Web Mining Research: A Survey
Raymond Kosala and Hendrik Blockeel gave a very comprehensive overview and
survey of Web mining research. In their paper, they described the fundamentals of the
World Wide Web (Web) and the analysis of different research surveys regarding Web
mining. They suggested three Web mining categories and discovered the connection
between Web mining and related agent paradigms [8].
In the survey, the authors briefly discuss the problems that users face when interacting
with the Web. Users encounter low precision and low recall, find irrelevant information,
and create redundant knowledge out of the information available on the Web. Users also
find it difficult to customize their information and to get relevant comments from
consumers and individual users. According to the survey, Web mining can address these
problems through resource search, information transformation, data generalization, and
information analysis.
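Precision and recall, the two measures mentioned above, have standard definitions: precision is the share of retrieved documents that are relevant, and recall is the share of relevant documents that are retrieved. A minimal sketch, with hypothetical document identifiers, is:

```python
from typing import Set, Tuple


def precision_recall(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Return (precision, recall) for one query; empty sets give 0.0 instead of an error."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical result of one search: four pages returned, three pages actually relevant.
retrieved_pages = {"a", "b", "c", "d"}
relevant_pages = {"a", "b", "e"}
precision, recall = precision_recall(retrieved_pages, relevant_pages)
```

Here two of the four retrieved pages are relevant (precision 0.5), and two of the three relevant pages were found (recall of about 0.67).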
They also distinguish Web mining from information retrieval (IR), information
extraction (IE), and machine learning. IR, or document discovery, retrieves as many
relevant documents as possible. IE extracts the relevant facts from selected
documents. Most IE systems focus on extracting from specific Web sites. Web mining is the
combination of IE and IR. Machine learning is applied to the processes of Web mining to
provide support and help.
The authors categorize Web mining into three parts: Web content
mining, Web structure mining, and Web usage mining. Web content mining describes the
discovery of useful information from Web contents, and it has two different views: the IR
view and the Database (DB) view. Web structure mining focuses on the structure of the
hyperlinks within the Web itself. Web usage mining concentrates on the prediction of
user behavior while users interact with the Web. The table below describes the three Web
mining categories in more detail.
View of Data
- Web Content Mining, IR View: Unstructured; Semi structured
- Web Content Mining, DB View: Semi structured; Web site as DB
- Web Structure Mining: Links structure
- Web Usage Mining: Interactivity
Main Data
- Web Content Mining, IR View: Text documents; Hypertext documents
- Web Content Mining, DB View: Hypertext documents
- Web Structure Mining: Links structure
- Web Usage Mining: Server logs; Browser logs
Representation
- Web Content Mining, IR View: Bag of words, n-grams; Terms, phrases; Concepts or
ontology; Relational
- Web Content Mining, DB View: Edge-labeled graph (OEM); Relational
- Web Structure Mining: Graph
- Web Usage Mining: Relational table; Graph
Method
- Web Content Mining, IR View: TFIDF and variants; Machine learning; Statistical
(including NLP)
- Web Content Mining, DB View: Proprietary algorithms; ILP; (Modified) association rules
- Web Structure Mining: Proprietary algorithms
- Web Usage Mining: Machine learning; Statistical; (Modified) association rules
Application Categories
- Web Content Mining, IR View: Categorization; Clustering; Finding extraction rules;
Finding patterns in text; User modeling
- Web Content Mining, DB View: Finding frequent substructures; Web site schema discovery
- Web Structure Mining: Categorization; Clustering
- Web Usage Mining: Site construction, adaptation, and management; Marketing; User
modeling
Figure 1.7 Table of Web mining categories
Web mining, although complicated in concept, is frequently used
nowadays. R. Kosala and H. Blockeel focus on representation issues and applications,
point out the confusion surrounding Web mining, and present new challenges to
traditional data mining algorithms [8].
1.3.2.7 Editorial: Special Issue on Web Content Mining
In their paper, Bing Liu and Kevin Chen-Chuan Chang focus on Web content mining
to bring results together and to encourage more research activity in the field. They
characterize the Web with a few observations. According to them, Web data or
information is huge, wide, diverse, easily accessible, and heterogeneous. Much
of the Web information is semi-structured, linked, and redundant. The Web consists of the
surface Web (HTML-based content) and the deep Web (information retrieved from
databases). The Web provides services and virtual societies, and the Web is dynamic.
They list numerous research topics in Web content mining:
Structured Data Extraction, Unstructured Text Extraction, Web Information
Integration, Building Concept Hierarchies, Segmenting Web Pages and Detecting Noise, and
Mining Web Opinion Sources [9]. For our project, we will combine the methods of
Segmenting Web Pages and Detecting Noise and Building Concept Hierarchies. The authors
present several approaches for each research topic, offered by different
researchers. They also introduce eight recent research papers in Web content mining that
aim to make a significant impact on real-world applications [9].
1.3.2.8 WEBKDD 2002 – Web Mining for Usage Patterns & Profiles
In this paper, B. M. Masand et al. provide a summary of the WEBKDD 2002
workshop. They focus on Web usage mining and divide the presentations into four
sessions. The first session is categorization of users and usage. This session focused on
addressing fundamentals of Web usage and classifying the user population into various
categories. Two papers were presented [10]. The second and third sessions are predictions
and recommendations, which presented new approaches and new directions in
foundational issues of recommendation systems. Six papers were presented:
- “Categorization of web pages and user clustering with mixtures of hidden Markov
models” – suggests hidden Markov models for modeling click streams.
- “Web Usage Mining by means of Multidimensional Sequence Alignment
Methods” – offers the Multidimensional Sequence Alignment Method (MDSAM) for
mining navigation patterns.
- “A Prediction Model for User Access Sequences” – recommends a model for
sequentiality and personalization, which preserves the sequence of the click stream in
the antecedent and consequent and measures the time gap between both.
- “Coping With Sparsity in A Recommender System” – reports an experiment using the
Knowledge Pump (KP) recommender system.
- “On the use of constrained association rules for web mining” – explores
recommendation systems in more detail.
- “Mining WWW Access Sequence by Matrix Clustering” – explores sequence
mining for Web data.
The fourth session is Evaluation of Algorithms. This session evaluates the
proposed algorithms and examines how effective they are. Two papers were presented,
according to [10].
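To give a concrete feel for click-stream prediction, the sketch below counts page-to-page transitions in recorded sessions and predicts the most frequent next page. It is a first-order Markov chain over visible pages, which is deliberately simpler than the hidden Markov models and sequence models proposed in the workshop papers; the sessions and function names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional

# Hypothetical click streams: each session is the ordered list of pages one user visited.
SESSIONS: List[List[str]] = [
    ["home", "books", "item"],
    ["home", "books", "cart"],
    ["home", "cars", "item"],
    ["home", "books", "item"],
]


def transition_counts(sessions: List[List[str]]) -> Dict[str, Counter]:
    """Count how often each page is followed directly by each other page."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for session in sessions:
        for current, following in zip(session, session[1:]):
            counts[current][following] += 1
    return counts


def predict_next(counts: Dict[str, Counter], page: str) -> Optional[str]:
    """Return the most frequent successor of a page, or None if the page was never left."""
    if page not in counts:
        return None
    return counts[page].most_common(1)[0][0]


counts = transition_counts(SESSIONS)
```

With these sessions, `predict_next(counts, "home")` returns `"books"`, because three of the four sessions move from the home page to the books page.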
1.3.2.9 A Web Personalization System based on Web Usage Mining Techniques
M. Albanese, A. Picariello, C. Sansone, and L. Sansone focus on Web
personalization, “the process of customizing the content and structure of web sites” [11],
by means of web usage mining. Their approach personalizes the Web with a two-phase
classification that uses both user-provided data and browsing patterns to classify users
and contents. In the first phase, they perform pattern analysis and classification with
an unsupervised clustering algorithm, Autoclass C, based on user-provided data, with the
aim of maximizing the generalization capability of the system. In the second phase, they
repeat a reclassification, carried out by log analysis and content management modules
based on user behavior, until they find a suitable solution. Users can interact with the
site in three ways: queries (containing some keywords), searches (among
directories), and navigation (of the sites).
They ran their experiments on a commercial web site, pariare.com,
which provides information about entertainment. The system classifies users by
the following features:
1. Age
2. Sex
3. Category of places in which users prefer to go
4. Number of times per week in which users go out
5. Preferred day of the week to go out
6. The Pariapoli parameter (degree of interest)
7. Type of entertainment
Figure 1.8 Distribution of users among classes produced a) by Autoclass C at the time of
the last reclassification; b) by the last run of the reclassification algorithm.
In the distributions above, features three and four have increased, which
suggests that the system achieves a better user classification.
1.3.2.10 Assessing the Quality of Auction Web Sites
In this paper, S. J. Barnes and R. T. Vidgen introduce WebQual, an instrument for
“assessing the quality of Internet sites from the perspective of the customer and the
context of the competition” [12]. Their goal is to evaluate the
usefulness and validity of the WebQual instrument as a generic tool for web site quality.
They evaluated three auction sites, Amazon, eBay and QXL (a UK auction web site), in
terms of site design, interaction, information, and domain-specific qualities. The top
ten questions used in the evaluation are listed below:
1. Feels safe to complete transactions
2. Personal information feels secure
3. Can be depended upon to deliver goods/ services promised
4. The site is easy to navigate
5. Provides accurate information
6. Has a good reputation
7. Has trustworthy sellers
8. Provides believable information
9. Provides relevant information
10. The site is easy to find
In their research, Amazon came out on top in terms of user-perceived
quality. The reputation and advertising of auction sites are very important to
consumers in terms of attractiveness, reliability and security.
1.3.2.11 Designing Multinational Online Stores: Challenges, Implementation
Techniques and Experience
An online store is a virtual store. Each user has a shopping cart, and each
item the user selects is added to the current cart. When users finish
shopping, they should be able to check out. Checkout converts all the
selected items into orders. Usually, customers are asked to fill in some personal
information such as billing address, shipping address and credit card information. Once
a user finishes checkout, the order information and a confirmation are sent to the
customer’s email address.
The basic online store model pairs one country with one language, which avoids
confusion over currency, payment, and taxes. As the store grows, it may
expand to different countries. If a country has more than one official language, the
online store should be designed to support all the official languages of that country. For
different countries, it is good to provide different languages.
The first store model is a single store for each language and country. This approach
is very easy to implement. Its drawback is that it is more costly to go
after a new market in a new country. The second model is a single multinational store for
all countries. This approach is more difficult to implement. The business logic must
change to handle country-specific order prices, and this consideration must be included
in the database schema. The third model is a single multinational store for a region.
This does not mean having a single database handle a whole region of countries. Different
regions have different servers, and servers are categorized by region. This can solve
language issues, such as Unicode and other character sets that conflict with each other [14].
1.3.2.12 Customer-centered Rules for Design of E-commerce Web sites
This paper divides customer-centered e-commerce design into several
components: homepage, navigation, categorization, product information,
shopping cart, checkout and registration, and customer service. Each component has
several rules [15].
For the homepage component, the web page should be clean and not cluttered with
text and graphics. The page should be no wider than the browser window,
to avoid horizontal scrolling. For navigation, text on links or buttons should be
self-explanatory and descriptive. When linking to another product-related web site, link to
the exact product page instead of the homepage of that site. For categorization,
categorize products in a way that is meaningful to regular customers. The depth of the
categories should be no more than 3. For product information, present accurate,
consistent, and detailed descriptions of products. Provide accurate and full pictures of
products. Present the size and inventory information of products in a measurable and
comparable way, and present the inventory information of a product at the beginning.
Present products in a table with enough information to make a purchasing decision, such
as prices and features, for easy comparison. Present related
charges up front and in an accurate way. Products should not be removed from the
page just because they are out of stock. For the shopping cart, provide a link on the
shopping cart page that takes the customer back to the page he/she left to continue
shopping. For checkout and registration, ask only for necessary and meaningful
information such as name and address. Ask no marketing questions. For customer service,
provide a 1-800 number for customers to call. Clearly state the return policy in a
prominent place.
1.3.2.13 A Comparative Usability Evaluation of User Interfaces for Online Product
Catalog
There are many different catalog interfaces on the web. The most widely used
are hierarchically organized catalogs. Customers see links to similar products
and then click a link to see the details of a product. This can be confusing.
A better way is to provide a search engine in which users enter some data and
the search engine returns results. This eliminates the confusion that arises
when users do not know which category a product falls under. There are
two search models: the object search model and the attribute search model. Object search
is based on the product model. For example, when searching for a car, you enter the car
model. Attribute search is based on attributes such as price and year. If the user does
not specify a search range, the maximum number of results should be returned [16].
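The two search models can be illustrated with a minimal sketch. This is not code from the reviewed paper: the language (Python), the `Car` record, the sample catalog, and the function names `object_search` and `attribute_search` are all assumptions made for illustration. It shows an object search that matches on the car model, and an attribute search where any bound the user leaves unset does not restrict the results, so an unconstrained search returns everything.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Car:
    model: str
    price: int
    year: int


# Hypothetical sample catalog, for illustration only.
CATALOG = [
    Car("Civic", 4500, 1998),
    Car("Accord", 7200, 2001),
    Car("Civic", 9800, 2003),
]


def object_search(catalog: List[Car], model: str) -> List[Car]:
    """Object search: match on the product model itself."""
    wanted = model.strip().lower()
    return [car for car in catalog if car.model.lower() == wanted]


def attribute_search(
    catalog: List[Car],
    min_price: Optional[int] = None,
    max_price: Optional[int] = None,
    year: Optional[int] = None,
) -> List[Car]:
    """Attribute search: filter on price and year.

    A bound left as None is not applied, so calling this with no
    arguments returns the whole catalog.
    """
    results = []
    for car in catalog:
        if min_price is not None and car.price < min_price:
            continue
        if max_price is not None and car.price > max_price:
            continue
        if year is not None and car.year != year:
            continue
        results.append(car)
    return results
```

Treating a missing bound as "no restriction" is one way to read the rule that an unspecified range should return the maximum number of results.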
1.3.2.14 Effects of Scent and Breadth on Use of Site-specific Search on E-commerce
Web sites
There are several approaches to designing a web application. The first is the
cost-benefit approach: people not only evaluate the quality of a product but also pay
close attention to its price [17]. This implies that we should present products so that
the cost appears small and the benefit large. This greatly affects the user’s
decision and increases the probability that customers will purchase those
particular products.
The second approach is attention capture [17]. If a search function is perceived as
a landmark by the user due to the brightness contrast of the text field with the background
of the page, then the user may perceive it as an alternate way to view the categories. When
designing the search engine, a search for a specific item should also return the more
general items. For example, if a user searches for “Air storm GTA
softball bat”, both the Air storm GTA softball bat and other softball bats should appear.
1.3.2.15 The Dynamics of Mass Online Marketplaces: A Case Study of An Online
Auction
This article covers auction interface design, market techno-structure design,
market navigation design, and item display design [18]. The auction interface should
reflect not only how people transact but also how they interact with the web application
as a virtual marketplace. The interface should clearly define the rules and
procedures for trading between sellers and buyers. The market techno-structure is the set
of formal procedures that control trade execution between participants. The rules should
have a formal mechanism for market management to ensure that everyone follows the
correct rules in the market. Market navigation design is basically a search engine that
enables users to search auction items more efficiently and quickly. For navigation, people
may choose a breadth or a depth orientation for the navigation structure.
Item display design determines how the items are presented to the users and
where the items are located. The display should have text and graphics. Some related
information should also be presented.
1.3.3 Literature Survey Conclusion
Along with the rapid growth of web-based database systems, many problems
related to the systems have appeared. For example, users cannot find satisfactory answers
by using form-based interface applications; form-based interface systems are difficult to
maintain because of their poor documentation; and a large amount of potentially useful
information is hidden from users. Recent research indicates that many people are trying
to solve these problems by employing such effective methods as the imprecise query
approach, keyword-based queries, Web testing with AGENDA, and system development
tools. The imprecise query approach allows users to quickly find relevant answers
without refining their queries and it can be implemented without affecting the existing
database. The keyword-based queries method reduces the users’ work on inputting many
different queries. Web testing with AGENDA gives us a new technique to extract useful
information in a Web database application. The system development tools help
developers to make reliable web-based systems. Most of these methods are effective and
they can help developers to design better web-based systems in the future.
Besides that, data mining or Web mining is an important element for obtaining
more accurate and efficient results. We learned the different types of Web mining and
their importance in different fields. Web mining considerably changes
data collection, which makes the system run more efficiently [4]. Quality control is
also a critical element of our project for keeping our information secure and safe.
We will design a multinational web application that can be used in different
cultures. Since our web application is only concerned with local business, the single
multinational store for all countries suits our project. In addition, the breadth and
depth navigation approach can be applied to organize the product categories.
1.4 Goals and Objectives
Our team members are good at database design, PHP, MySQL, Oracle, and
interface design. We are going to follow the database design process to design our
system. First, we will identify the entities that correspond to the application functions
and then work out the relationships between the entities. Second, we are
going to draw an ERD to express the entities and their relationships. Since the ERD
expresses the data structure of our system and is the core part of the system, we will
check the ERD many times to make sure it is well designed (see Figure 1.9, a sample
ERD). Third, we will use MySQL to create tables based on the ERD and put the tables
into our database server. In the meantime, we are going to load some fake data into the
tables for testing purposes. Finally, we will use PHP, a scripting language, to write the
system code and make the application interface interact with the database properly.
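The third step, turning ERD entities into tables and loading fake data, can be sketched as follows. This is a hedged illustration rather than the project's actual schema: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the column names (Uid, Uname, Bid, Title, Author, Bprice, Bcondition, Bpost_date) are borrowed from labels in the project's ERD, and the table names, column types, foreign key, and sample rows are assumptions.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL server.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# One table per entity; the "Sell" relationship becomes a foreign key
# from each book to the user who posted it (an assumed mapping).
conn.executescript(
    """
    CREATE TABLE users (
        uid   INTEGER PRIMARY KEY,
        uname TEXT NOT NULL
    );
    CREATE TABLE book (
        bid        INTEGER PRIMARY KEY,
        uid        INTEGER NOT NULL REFERENCES users(uid),
        title      TEXT NOT NULL,
        author     TEXT,
        bprice     REAL,
        bcondition TEXT,
        bpost_date TEXT
    );
    """
)

# Fake data for testing purposes.
conn.execute("INSERT INTO users (uid, uname) VALUES (?, ?)", (1, "test_seller"))
conn.execute(
    "INSERT INTO book (uid, title, author, bprice, bcondition, bpost_date) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    (1, "Database Systems", "A. Author", 45.0, "good", "2005-03-01"),
)

# Joining the two tables follows the relationship drawn in the ERD.
rows = conn.execute(
    "SELECT u.uname, b.title, b.bprice FROM book b JOIN users u ON u.uid = b.uid"
).fetchall()

# A book that points at a non-existent user is rejected by the foreign key.
try:
    conn.execute("INSERT INTO book (uid, title) VALUES (?, ?)", (99, "Orphan"))
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The foreign key is what lets the database itself guard the relationship, instead of relying only on application code.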
[Figure: ERD of MU Bulletin Board. Entities shown include User (Seller, Buyer), Login,
Contact Information, Book, Electronic, Furniture, Car, House, Renting, Picture, News, and
Special. Attribute labels include Uid, Uname, Login_id, Password, Email, Phone, Cellphone,
Street, City, Zip, Pid, Bid, Title, Author, Bprice, Bcondition, Bpost_date, Eid, Ename,
Emodel, Ebrand, Eyear, Eprice, Econdition, Epost_date, Fid, Fbrand, Fyear, Fprice,
Fcondition, Fpost_date, Cid, Cmodel, Cbrand, Cyear, Cprice, Milege, Ccondition,
Cpost_date, Rid, Rprice, Location, Available_time, and Rpost_date. Relationship labels
include Authentication, Has, Isa, Buy, Sell, Post, Rent, and load.]
Figure 1.9 A typical ERD
Our web-based database application system will contain three layers (see
Figure 1.10). The top layer is the user’s web browser, which serves as the interface.
Users will click icons to find their desired information or fill in forms
to find a particular message. The middle layer is the application logic. We will use a
server-side scripting language (PHP) to communicate with the DBMS. The bottom layer
consists of the DBMS and the databases that contain the different data.
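[Figure: System Diagram of MU Bulletin Board. The Login Browser connects over the Internet
to the Web Server, which hosts the Book, Electronic, Furniture, Car, Renting, Ticket,
Picture, News, and Contact Information components on top of the MySQL/Oracle DBMS and
the Database.]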
Figure 1.10 MU Bulletin Board system diagram
We plan to use dbms-unix.missouri.edu as our database server since both MySQL
and Oracle are available on this reliable and safe server. Since we are designing a
new system, we have only a few internal constraints to worry about. The main internal
constraint of our project is the web server. We are using our own personal computer to run
our web applications. The protection and security of our web server is our main concern
because if the web server fails, everything is lost. In addition, as the literature
review above shows, the quality and security of our project are important for attracting
users. Low reliability and security may result in project failure. We need to attract more
users to our project and keep their information secure. We also need to
consider external factors such as the location of the software and external
links. For example, a buyer can send a message to a seller by clicking the listed email
address of the seller. This means that we need to consider the environment of users’ email
systems. In addition, we are going to have a clear and easy-to-use interface. For example,
the main page will have several categories such as the used book function,
renting function, used electronic function, used furniture function, ticket function, job
finder function, news function, and bid function. If a user clicks the icon of the used car
function, another window will pop up (see Figure 1.11, a similar window).
Figure 1.11 Example query interfaces
We are going to have 8 functions in our system. The used book function allows
users to post or remove their own information, and it can return particular results
specified by users. For example, it can return all books priced from $40 to $60. The
renting function, used electronic function, used furniture function, ticket function, and job
finder function are similar to the used book function. The news function is read-only
for regular users. This means that only authorized users can post information
through this function. The bid function is similar to eBay. A seller can put his/her item
on the Web site and allow other users to bid on the item for a period of time.
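The price-range example for the used book function can be sketched as a parameterized query. This is an illustrative sketch only: Python's sqlite3 module stands in for the planned PHP/MySQL stack, and the table layout, the function name `find_books_in_price_range`, and the sample rows are assumptions.

```python
import sqlite3


def find_books_in_price_range(conn, low, high):
    """Return (title, price) for books priced from low to high, inclusive.

    The bounds are passed as query parameters rather than pasted into
    the SQL string, so user input cannot change the query itself.
    """
    cursor = conn.execute(
        "SELECT title, bprice FROM book "
        "WHERE bprice BETWEEN ? AND ? ORDER BY bprice",
        (low, high),
    )
    return cursor.fetchall()


# Fake data for testing, held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (bid INTEGER PRIMARY KEY, title TEXT, bprice REAL)")
conn.executemany(
    "INSERT INTO book (title, bprice) VALUES (?, ?)",
    [
        ("Calculus", 35.0),
        ("Operating Systems", 40.0),
        ("Networks", 55.5),
        ("Compilers", 75.0),
    ],
)

matches = find_books_in_price_range(conn, 40, 60)
```

The same pattern would apply to the renting, electronic, furniture, and ticket functions, with a different table and column in each case.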
The initial prototype is expected to have a working basic user interface
for users to buy or sell their items. Information such as used books, cars, electronics
(including personal computers), tickets, and announcements should be stored in
different catalogues in the database and shared among users. Each item will be removed
after a period of time to manage expired data. User information will also be stored in the
database for further contact and auction functionality. Future prototypes will
expand the basic functionality of the system. This includes creating a bid section in
different catalogues for users, adding additional functionality, expanding the catalogues
of items, and querying the website.
Everyone is going to work on interface design, ERD design, implementation, and
documentation. We will try to design a clean interface that is easy to use. We will also try to
design a well-structured ERD in order to make an efficient system. Each team member
will be in charge of implementing 2 or 3 functions. After coding, we will use fake data to
test our system to see if the system will return desired answers. In the meantime, we will
write the documentation.
1.5 Overall Approach
Since our overall approach is based on established technology and many open
source applications, the relative cost of developing this project is low. PHP and MySQL
provide the advantage of open source and low cost, without sacrificing functionality.
Both of these free open-source software applications will reduce the time spent in
development as well as help to create a user-friendly, better-designed web server. The
PHP language itself provides object-oriented server-side scripting that can be easily
modularized and reused. This will reduce the development and testing time of the
server-side application.
The major advantage of our chosen solution is the reduction in time and cost. By
using our own personal computer as a web server, together with open source software and
the database server of MU, we have cut the cost of our project drastically. The only cost
of our project is the labor of developing the web services.
Our goal is to bring users a clean, useful, convenient, safe, and robust Web site.
But we face several problems for this project. First, we have limited time. We may not be
able to provide good software documentation because of the time constraint. Second, we
may not be able to test the system effectively since we have only fake data, not real
data. Third, the system will be removed after this semester because the server
account is from a team member who takes a database course this term.
Requirement Analysis of MU Bulletin Board
Team
Lim, Carol Teng Yik
(913166)
Yi Wang
(838550)
Lei Wen
(884241)
Mentor
John Boyer (Programmer/Analyst of MSA & Student Life)
April 7, 2005
2.1 Introduction
The MU Bulletin Board is initially designed for the people who live in Columbia,
Missouri. It brings convenience to users’ daily lives in the local area. So, it is
important for the MU Bulletin Board to have the attributes that help users
take advantage of this convenience. In order to achieve this goal, our system should meet many
requirements such as system constraints, performance metrics, and resource usage. In
addition, we will provide alternative solutions, testing methods and scheduling of the
project.
2.2 Overall Description
Our requirement analysis contains five parts. First, this document describes the
system requirements and constraints in section 2.3. In this section, we discuss the
operating environment, market users and characteristics, environmental constraints,
system components, software interfaces and libraries, and system maintenance. Second,
this document describes the performance requirements of the system in section 2.4.
Third, in section 2.5, this document describes the resource requirements on time,
resources, facilities, and budget. Fourth, this document describes an alternative solution
for our system. Finally, this document describes the evaluation metrics of our system.
2.3 System Requirements and Constraints
2.3.1 Operating Environment (External Constraints)
All users should have at least one valid email account because we need email
communication in our system. For example, a buyer sends a message to a seller by
clicking the listed email address of the seller. This means that we need to consider the
external link that is the users’ email system. Another external system constraint is the
location of our software. Our SQL code will be stored in the MySQL database account of
MU. For the purpose of the project, our system code will be stored temporarily in one of
our team member’s Bengal/database account.
All users should be able to access the Internet to interact with our Web site since
our system is a web-based database system. Users’ web browsers should support CSS
and JavaScript; PHP runs on the server, so it places no requirement on the browser.
Compatible web browsers include Internet Explorer 6.0 and Netscape 5.5.
We will use the MU software system to implement our system.
We will use the Linux system (EBW Computer Lab) of the University of Missouri to
implement our system. Our system includes three major components: a web server, a
relational database, and a server-side scripting language. We will use the Apache web server,
the MySQL database, and the PHP server-side scripting language. Apache is a Unix-based,
open-source web server that is used to host most sites on the Internet. MySQL is a
multi-user, multi-threaded SQL (Structured Query Language) database server and is a
client/server implementation that consists of a server daemon and many different client
programs/libraries. PHP, a server-side HTML-embedded scripting language, is used to
create dynamic web pages. A dynamic web page is a page that interacts with the user, so
that each user visiting the page sees customized information. PHP is freely available and
used primarily on Linux (UNIX) web servers.
2.3.2 Market Users and Characteristics
Since our system only deals with local business, we do not have the shipping and
handling function in our system. A buyer can meet the seller in a local place for
bargaining the price or picking up the item. This kind of local item trading can save a lot
of money for buyers on shipping and handling, especially for large items. In addition,
since our system is free, it will attract many users to trade through it. As a result, our
database will contain a lot of data (more choice) and people can easily find what they need.
There are many similar systems available online, such as eBay and Amazon, but users
have to pay extra money for their services, handling and shipping.
Our system will meet the following customer requirements:
1) All customers should be able to register as new users.
2) All users should be able to login and logout.
3) All users should be able to view or add new announcements or activities, such
as yard sale activities.
4) All users should be able to post new selling item information.
5) All users should be able to delete their sold item pages or information.
6) All users should be able to view items by category.
7) All users should be able to search items by criteria.
8) All users should be able to delete old posted messages.
9) All users should be able to upload pictures.
10) The administrator should be able to delete and block users (a fine level of control
over user activities).
11) The administrator should have more functions than normal users, such as
school event announcements.
12) User accounts will be deleted automatically if the user has not logged in for 6 months.
2.3.3 Environmental Constraints
Our team consists of three members. Everyone has the requisite experience in
database design, PHP, MySQL, Apache, database management and interface design.
Everyone is assigned to work on interface design, system design, implementation, and
documentation, in line with each member’s interests. If one of the team members cannot finish
his/her job because of illness or withdrawal from the course, the other team members will
pick up that member’s job quickly. Our team members use email and phone to discuss
our project, and we meet every week to discuss and improve it.
Besides human factors, our system has environmental constraints on quality, reliability,
safety, and suitability.
Quality: MU Bulletin Board is designed as a website that Internet users will be
familiar with. When users log in to the MU Bulletin Board, they can start navigating the
website. If users do not understand a function on the website, they can look for the
help link in MU Bulletin Board. The user guide will be available through the help link
for reference.
Reliability: When the user enters data into MU Bulletin Board, the data needs to
be stored and retrieved correctly. When a different user is using the MU Bulletin Board,
that user cannot simply view the previous user’s page. For example, suppose John logs out
of the MU Bulletin Board after entering some data in his account but forgets to close the
browser. If Mike then tries to enter John’s account by clicking the ‘back’ button
on the browser, he will fail. Our database will delete users who have not used it
for over six months.
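The six-month cleanup rule can be sketched as a periodic delete. This is a minimal sketch under stated assumptions: Python's sqlite3 module stands in for MySQL, "six months" is approximated as 183 days, and the table name `users`, the `last_login` column (stored as an ISO date string), and the function name `purge_inactive_users` are hypothetical.

```python
import sqlite3
from datetime import date, timedelta

# Assumption: "six months" is approximated as 183 days.
INACTIVITY_LIMIT = timedelta(days=183)


def purge_inactive_users(conn, today):
    """Delete users whose last login is older than the inactivity limit.

    Dates are stored as ISO strings (YYYY-MM-DD), which sort in
    chronological order, so a plain string comparison is enough.
    Returns the number of deleted accounts.
    """
    cutoff = (today - INACTIVITY_LIMIT).isoformat()
    cursor = conn.execute("DELETE FROM users WHERE last_login < ?", (cutoff,))
    conn.commit()
    return cursor.rowcount


# Fake data for testing.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (uid INTEGER PRIMARY KEY, uname TEXT, last_login TEXT)"
)
conn.executemany(
    "INSERT INTO users (uname, last_login) VALUES (?, ?)",
    [("active_user", "2005-03-01"), ("stale_user", "2004-06-15")],
)

removed = purge_inactive_users(conn, date(2005, 4, 7))
remaining = [row[0] for row in conn.execute("SELECT uname FROM users")]
```

Passing `today` in as an argument, rather than reading the clock inside the function, keeps the rule easy to test with fake data.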
Safety: Users of the MU Bulletin Board are asked for their login name and
password every time they log in. The password is encrypted. The data in question will
be available only to authorized users. MU Bulletin Board will log the user out
automatically if the user is idle for longer than 5 minutes. All inputs will be validated.
Attempts to enter HTML tags and JavaScript in textboxes and text areas will be detected by
the system.
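The three safety rules above can be sketched in a few small functions. This is a hedged illustration, not the project's implementation: it is written in Python rather than PHP, it stores a salted PBKDF2 hash instead of a reversibly encrypted password (a swapped-in technique), and the function names, the iteration count, and the tag-detecting regular expression are all assumptions.

```python
import hashlib
import hmac
import os
import re

IDLE_LIMIT_SECONDS = 5 * 60  # auto logout after more than 5 idle minutes

# Matches things that look like HTML tags, e.g. <script> or </b>.
TAG_PATTERN = re.compile(r"<\s*/?\s*[a-zA-Z][^>]*>")


def hash_password(password, salt=None):
    """Return (salt, digest) for a password using salted PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Check a login attempt against the stored salt and digest."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


def session_expired(last_activity, now):
    """True if the user has been idle for longer than the limit (in seconds)."""
    return now - last_activity > IDLE_LIMIT_SECONDS


def contains_markup(text):
    """Detect attempts to enter HTML tags (including script tags) in a field."""
    return bool(TAG_PATTERN.search(text))
```

Detection alone is a coarse filter; escaping user text when it is written back into a page would be a natural complement, though the document does not specify that step.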
Suitability: MU Bulletin Board is built on the Web server and database server
of the University of Missouri. The servers operate and support the Web sites and databases
24 hours a day. So, as long as the user has Internet access, the user will be able to log in
to MU Bulletin Board at any time.
2.3.4 System Components
We will use the Spiral Model to develop our system. We will define and
implement the highest priority features, then get feedback from users. With this
knowledge, we will then go back to define and implement more features. The spiral
model, illustrated in Figure 2.1, combines the iterative nature of prototyping with the
controlled and systematic aspects of the waterfall model, therein providing the potential
for rapid development of incremental versions of the software. In this model the software
is developed in a series of incremental releases with the early stages being either paper
models or prototypes. Later iterations become increasingly more complete versions of the
product.
Figure 2.1 Spiral Model
Our web-based database application system will contain three layers (see
Figure 2.2). The top layer is the user’s web browser, which serves as the interface.
Users will click icons or links to find their desired information or
fill in forms to find a particular message. The middle layer is the application
logic, which contains 9 functions. We will use a server-side scripting language (PHP)
to communicate with the Database Management System (DBMS). The bottom layer consists of
the DBMS and the databases that contain the different data.
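[Figure: System Diagram of MU Bulletin Board. The Login Browser connects over the Internet
to the Web Server, which hosts the Book, Electronic, Furniture, Car, Renting, Ticket,
Picture, News, and Contact Information components on top of the MySQL/Oracle DBMS and
the Database.]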
Figure 2.2 MU Bulletin Board System Diagram
2.3.5 Software Interface and Library
We will have a clean and clear interface that is very easy to use (see Figure 2.3, a
sample interface). The interface has everything the user needs, such as news announcements,
a search engine, and item catalogues. The login link is at the bottom left of the interface. A
user can view his/her personal information by typing his/her user ID and password. We
will use PHP’s mail() function in our system. This function allows a user to send an email;
it hands the message to a mail server, which delivers it over SMTP. We will provide
forms for users to post their selling item information. We will use MySQL commands,
such as CREATE TABLE, SELECT, INSERT, UPDATE, and DELETE, to create tables
and carry out users’ requests. The MySQL queries will be issued from PHP code embedded in
HTML files, and their results will be written into the page with PHP constructs such as echo.
Figure 2.3 A Sample Interface
2.3.6 System Maintenance
The administrators should be able to delete incorrect information and check the
stability of the system every week. The system will provide an email link in order to help
those users who have problems. For system maintenance, the system administrator will
update the system every three months. For example, the administrator can add a new
function into the system and delete the users’ old data.
2.4 Performance Requirements
In our project, we use the MySQL database of the University of Missouri. If our database
contains too much data (more than 1000 users, for example), MySQL will
respond to users’ requests slowly. To address this problem, we will
index certain tables in order to speed up queries.
When users navigate the MU Bulletin Board, pages need to load within 5
seconds. Users should not have to wait longer than 5 seconds to get table content or
to upload files. The server should respond in a short period of time. Database queries
need to be simple in order to avoid delays in retrieving data.
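The effect of indexing can be observed by asking the database how it plans to run a query before and after an index is created. The sketch below is illustrative only: it uses Python's sqlite3 module and SQLite's EXPLAIN QUERY PLAN as a stand-in for MySQL (where EXPLAIN plays a similar role), and the `item` table, its columns, and the index name `idx_item_category` are assumptions.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan for a query as one string.

    The human-readable description is the last column of each row
    returned by EXPLAIN QUERY PLAN.
    """
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item (iid INTEGER PRIMARY KEY, category TEXT, title TEXT)"
)
conn.executemany(
    "INSERT INTO item (category, title) VALUES (?, ?)",
    [("book", "Calculus"), ("car", "Used sedan"), ("book", "Networks")],
)

lookup = "SELECT title FROM item WHERE category = ?"

# Without an index, the lookup has to scan the whole table.
plan_before = query_plan(conn, lookup, ("book",))

# Index the column that searches filter on.
conn.execute("CREATE INDEX idx_item_category ON item (category)")

# With the index, the planner can search it instead of scanning.
plan_after = query_plan(conn, lookup, ("book",))
```

Indexes speed up lookups at the cost of extra work on every insert and delete, so only the columns that searches actually filter on are worth indexing.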
2.5 Resource requirements
The system will need one month to implement. Our resources are all from MU.
Our database account is provided by a database course. As stated above, we will use
Apache (an open-source web server that is used to host most sites on the Internet),
MySQL (a database server), and PHP (a server-side HTML-embedded scripting
language). We will also use Gantt charts to monitor our project (see Figure 2.4). Our
database account will provide all the facilities. In addition, since our project is part of
course requirement, our budget is zero.
ID  Task Name                                   Start      Finish     Duration
1   Forming group and choose project            1/19/2005  2/1/2005   10d
2   Project Definition                          2/9/2005   3/8/2005   20d
3   Requirement Analysis and Literature Review  3/2/2005   3/23/2005  16d
4   Design Specification                        3/30/2005  5/3/2005   25d
5   Project implement, testing and V&V          2/1/2005   5/2/2005   65d
6   Final Documentation                         5/3/2005   5/9/2005   5d
(Timeline: Jan 2005 to Apr 2005, in weekly columns from 1/23 to 5/1.)
Figure 2.4 Gantt Chart Sample
2.6 Alternative solution
There are several alternative solutions, such as IIS (Internet Information Services),
SQL Server 2000, and ASP.NET. ASP.NET is a powerful framework for
developing web applications. ASP.NET must run on IIS, which is part of the Windows
XP Professional and Windows Server 2003 operating systems. Running IIS on Windows
XP Professional gives poor database management results. Favorable results are
obtained when using Windows Server 2003. In addition, we would need Visual Studio .NET
to write ASP.NET code. Since the computer labs of the University of Missouri do not
provide this software and our team does not have the money to buy it, we
cannot use Windows Server 2003 and Visual Studio .NET in our project. Furthermore,
most of our team members are not familiar with the alternative solutions. Using them
would require more time.
2.7 Evaluation metrics
Our system will allow 1000 users to access the database at the same time. When
users are navigating the MU Bulletin Board, pages need to load within 5
seconds. Each user has a maximum of 3 MB of storage for uploaded files and pictures
because we do not have much room in our database account. In addition, we are going to
run some queries and upload some pictures to test our system.
Design Specification of MU Bulletin Board
Team
Lim, Carol Teng Yik
(913166)
Yi Wang
(838550)
Lei Wen
(884241)
Mentor
John Boyer (Programmer/Analyst of MSA & Student Life)
April 20, 2005
3.1 Introduction
The MU Bulletin Board is developed as a combination of PHP scripts, HTML,
and SQL queries. For software tools, our team will use the vi editor, a screen-based
text editor used by many UNIX users, Adobe Photoshop, and SecureCRT. HTML
code is mainly used for interface design. PHP scripts are used to control the dataflow of
the website on the server. They connect MU Bulletin Board to the database and
control how the HTML output appears. The SQL database language is used for
querying the data that are stored in our database. It will access and retrieve the relevant
data in a short period of time. In the section on requirements analysis, we provided nine
functions to be used in our design. Now we will convert those functions into a set of
prototypes expressed in the PHP scripting language.
3.2 System Design Overview
Our system design specification consists of five parts. First (section 3.3), we will
provide the specification of the data requirements. In this section, we will explain the
methodology of data collection and storage, data format, and the database. Next (section
3.4), we will provide the details of software design such as diagrams, pseudo code, and
Web layout hierarchies. Third (section 3.5), we are going to provide the testing method
for our system. Following this (section 3.6), in addition to the scheduling diagram that we
have provided in the section on requirement analysis, we will provide a task assignment
for each team member. Finally (section 3.7), we are going to roughly estimate the cost of
our design implementation.
3.3 Data Requirements
Since we do not have real data, we are going to enter some prototype test data
into CSV files, and then load the data into our database in order to test and interact
with our system. A CSV (comma-separated values) file is often used to exchange data
between disparate applications. In a CSV file, the record separator may be a line feed
or a carriage return. For example, “Mike, Green, 100 Forum St., Columbia, MO, 65202”
is a record in a CSV file.
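As an illustration of loading CSV test data into a database, here is a minimal sketch in Python. It is not the project's loader: SQLite (in memory) stands in for MySQL, and the Contact column names, the second sample record, and the function name are assumptions made for the example.

```python
import csv
import io
import sqlite3

# The first record is the example from the text; the second is made up.
SAMPLE_CSV = (
    "Mike, Green, 100 Forum St., Columbia, MO, 65202\n"
    "Ann, Lee, 5 Elm St., Columbia, MO, 65201\n"
)


def load_contacts(csv_text, conn):
    """Parse CSV records, insert them into Contact, and return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS Contact ("
        "Fname TEXT, Lname TEXT, Street TEXT, City TEXT, State TEXT, Zip TEXT)"
    )
    # skipinitialspace drops the blank that follows each comma in the sample.
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    rows = [tuple(record) for record in reader if record]
    # Placeholders keep the data separate from the SQL text.
    conn.executemany("INSERT INTO Contact VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)
```

Calling load_contacts(SAMPLE_CSV, sqlite3.connect(":memory:")) inserts both sample records. On the MySQL server itself, the LOAD DATA INFILE statement is another way to import such files.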
We will use MySQL, the world’s most popular open source database management
system, which is hosted on the dbms-unix.cecs.missouri.edu server. MySQL is a
relational database management system that uses Structured Query Language (SQL), the
most popular language for adding, accessing, and processing data in a database. The
most important reasons for choosing MySQL are its award-winning speed, scalability,
and reliability. In addition, it is economical because MySQL is open-source software.
According to the lecture notes of Database Management System I, the size of one I/O of
the MySQL server is 4 KB.
3.4 Software Design
There are five steps in our software design phase. First, we came up with the system
diagram shown in the section on requirements analysis. Second, we provided the
ERD of our system, which we improved during the software design phase (see figure 3.1).
To clarify the data flow of the system, we also provide the data flow model in
this section (see figure 3.2). Third, we designed and modified our interface diagrams
(see figures 3.3.1~3.3.3). Fourth, we create tables in our database by using SQL. We
provide a small part of the code (see figure 3.4). Finally, we use PHP and HTML to
implement our system. We provide pseudo code for this part (see figure 3.5).
ERD of MU Bulletin Board
[Diagram not reproduced. Entities and attributes recovered from the figure labels,
grouped by attribute name prefix:]
User: Uid, Password, Uidentity
Contact Information: Fname, Lname, Email, Phone, Cellphone, Street, City, Zip
Catalogue: Caid, Caname
Book: Bid, ISBN, Btitle, Author, Bprice, Bcondition, Bpost_date, Bpicture
Electronic: Eid, Ename, Emodel, Ebrand, Eyear, Eprice, Econdition, Epost_date, Epicture
Furniture: Fid, Fbrand, Fyear, Fprice, Fcondition, Fpost_date, Fpicture
Car: Cid, Cbrand, Cmodel, Cyear, Ctransmission, Milege, Cprice, Ccondition,
Cpost_date, Cpicture
Ticket: Tid, Tprice, TLocation, Tdate, Ttime, Tdescription
News: Nid, Ntitle, NDescription, Norganization, Nlocation, Ndate, Ntime, Npost_date
Relationships: Has, Post
Figure 3.1 Updated ERD
MU Bulletin Board Data Flow Model
[Diagram not reproduced. Legend: External Entity, Data Store, Process. Flow labels:
User request queries; Queries request data; Return the requested data; Return result
to user.]
Figure 3.2 System Data Flow Model
Figure 3.3.1 Interface of the Front Page
Figure 3.3.2 Interface of the Normal User Book Function
Figure 3.3.3 Interface of Search Result
CREATE TABLE Book (
    Bid INT NOT NULL,
    ISBN VARCHAR(20),  -- stored as text: an ISBN can end in 'X' or start with 0
    Bcondition VARCHAR(100),
    Btitle VARCHAR(20) NOT NULL,
    Bprice DOUBLE NOT NULL,
    Author VARCHAR(50),
    Bpost_date DATE,
    Bpicture VARCHAR(100),
    Caid INT NOT NULL,
    CONSTRAINT PK_Book PRIMARY KEY (Caid, Bid),
    CONSTRAINT FK_Book FOREIGN KEY (Caid)
        REFERENCES Catalogue (Caid) ON DELETE CASCADE);
Figure 3.4 SQL query code
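The effect of the ON DELETE CASCADE clause in figure 3.4 can be seen in a small experiment. The sketch below is in Python with an in-memory SQLite database standing in for MySQL; the table definitions are simplified and the sample rows are made up. Note that in MySQL the constraint is only enforced by storage engines that support foreign keys (InnoDB), and SQLite needs the foreign_keys pragma switched on.

```python
import sqlite3


def cascade_demo():
    """Delete a catalogue row and count Book rows before and after."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE Catalogue (Caid INTEGER PRIMARY KEY, Caname TEXT);
        CREATE TABLE Book (
            Bid INTEGER NOT NULL,
            Btitle TEXT NOT NULL,
            Caid INTEGER NOT NULL,
            PRIMARY KEY (Caid, Bid),
            FOREIGN KEY (Caid) REFERENCES Catalogue (Caid) ON DELETE CASCADE
        );
        INSERT INTO Catalogue VALUES (1, 'Books');
        INSERT INTO Book VALUES (10, 'Database Systems', 1);
        INSERT INTO Book VALUES (11, 'Web Design', 1);
        """
    )
    before = conn.execute("SELECT COUNT(*) FROM Book").fetchone()[0]
    # Removing the parent row also removes every Book row that references it.
    conn.execute("DELETE FROM Catalogue WHERE Caid = 1")
    after = conn.execute("SELECT COUNT(*) FROM Book").fetchone()[0]
    conn.close()
    return before, after
```

In this setup cascade_demo() counts two books before the catalogue row is deleted and none afterwards, which is the behavior the design relies on when a category is removed.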
<?php
// Note: the mysql_* functions used here were removed in PHP 7 (mysqli or PDO replace them).
$link = mysql_connect("dbms-unix.cecs.missouri.edu", "yw7q6", "******")
    or die("Could not connect database: " . mysql_error());
mysql_select_db("yw7q6") or die("Could not select database yw7q6");
$query1 = "select Btitle,Bprice,Author,ISBN,Bpicture,Bpost_date from Book";
$result1 = mysql_query($query1) or die("Query1 failed : " . mysql_error()); // process the query
echo "<br><br><table border = 1>\n"; // return the result
// header row (the closing </tr> was missing in the original)
echo "\t<tr>\t\t<td>Title</td>\t\t<td>Price</td>\t\t<td>Author</td>"
    . "\t\t<td>ISBN</td>\t\t<td>Picture</td>\t\t<td>Post Date</td>\t</tr>\n";
while ($line1 = mysql_fetch_array($result1, MYSQL_ASSOC)) {
    echo "\t<tr>\n";
    foreach ($line1 as $col_value) {
        // escape stored values so they cannot inject HTML into the page
        echo "\t\t<td>" . htmlspecialchars($col_value) . "</td>\n";
    }
    echo "\t</tr>\n";
}
echo "</table><br><br>\n";
mysql_free_result($result1);
mysql_close($link);
?></center></body></HTML>
Figure 3.5 Sample HTML Pseudo code
We encountered two major difficulties during the design specification phase. First,
it was hard to communicate with each team member. Since some team members do not
check their email very often, we (the team) could not make decisions right away, and
this slowed our design process. Second, we had trouble obtaining permission to use a
MySQL account. Since the database account is for a database course, the TA of the
course informed us that we were not allowed to use it for another course (we got a
meru account from our capstone course, but that account does not support databases).
Fortunately, after we reported to the instructor of the database course, we were
allowed to use the MySQL database account. In addition, after obtaining the requisite
permission, we spent three days trying to solve a password problem with our database
account. This second problem delayed our implementation process. Since our project is
limited to software design and implementation and does not involve any hardware
components, we have no hardware design or implementation.
3.5 Testing Methods
Since we have two types of users, administrators and regular users, we will have
two types of tests. The testing steps are basically the same for both types of user.
Since regular users have fewer functions than administrators, we only provide the
testing methods for the administrator.
Login as Administrator
1. Test whether the correct menu appears after an administrator logs in.
2. Create several new users (1 Administrator and 2 Users). Fill in all required fields.
View the corresponding tables and see whether the data is inserted correctly.
3. Delete a user. Check whether the user is deleted from the user table.
4. Update the user’s information. Check whether the information in the user table is
updated.
5. Try to log in as a new user that has just been created and test whether the new user
can log out successfully.
6. After logging in, test each function one by one. For example, fill in all the required
fields in the Book function. View the Book table to see whether the data is inserted
correctly.
7. Delete users’ data in the Book function. Check whether the data are deleted from
the Book table.
8. Update users’ book information. Check whether the information in the Book table
is updated.
9. Go to the Book search. Try to search for the book that has just been inserted.
Check whether it can be found.
10. Do the same steps from 6 to 8 for the other functions.
11. Go to the upload and download files function. Try to upload a file to the web
space. Check whether it appears in the download column. Click the file to verify
that it is the same as the file that you uploaded. Try to delete a file from the web
space and check whether it is deleted.
12. Log out and click the Back button on the browser to check that the user’s
information is no longer valid.
3.6 Scheduling Diagrams with Task Assignments
The following Gantt chart represents a tentative schedule. As future requirements,
modifications, and complications arise, this schedule will be subject to change. Basically,
each team member is in charge of implementing three functions, and we will implement
the functions in parallel.
ID  Task Name                                    Start       Finish      Duration
1   Forming group and choose project             1/19/2005   2/1/2005    10d
2   Project Definition                           2/9/2005    3/8/2005    20d
3   Requirement Analysis and Literature Review   3/2/2005    3/23/2005   16d
4   Design Specification                         3/30/2005   5/3/2005    25d
5   Project implement, testing and V&V           2/1/2005    5/2/2005    65d
6   Final Documentation                          5/3/2005    5/9/2005    5d
(Timeline axis: weekly columns from 1/23 to 5/1, January to April 2005.)
Figure 3.6: Gantt chart scheduling
3.7 Design Implementation Costs
As stated in the problem definition, our desired solution reduces time and cost.
Since we are designing the website as volunteers, we plan to avoid any additional
costs. By using open-source software (PHP, MySQL, and the Apache web server), we have
drastically cut the costs of our project. Moreover, we use the
dbms-unix.cecs.missouri.edu database and web server, which belongs to the school. This
has reduced our costs for web server installation and maintenance. Overall, our
design implementation is essentially free of cost.
4.1 Introduction and System Implementation Overview
As stated in the Design Specification, this system has two parts: the database and
the web interface. We (the team) will implement the MU Bulletin Board using the vi
editor, SecureCRT, Adobe Photoshop, Microsoft Office, and other tools. Since we only
have three weeks to implement this system, we will implement part of the system and
see how the prototype works.
This section consists of five parts. First (section 4.2), we describe the system
initialization and the creation of the database tables. Next (section 4.3), we talk
about the basic web user interface. Third (section 4.4), we present the basic user
login/logout and register functions. Following this (section 4.5), we present two
similar functions, the Book and Car functions. Finally (section 4.6), we talk more
about the News function.
4.2 System Database Implementation
As our temporary database is hosted on dbms-unix.cecs.missouri.edu, most of the
database initialization has been done by the school. Since we do not have permission
to change anything on the database server, we go ahead and create our system database
tables, which are listed in [19]. For testing purposes, we will enter some prototype
data into the system for performance testing.
4.3 System Basic Interface
We designed the first interface (figure 4.1) after our first group meeting.
Afterward, we realized that the first interface was too simple and left little room
for future development. After doing some web research, we decided to combine elements
of two well-known website interfaces (Ebay.com and Yahoo.com) and came up with our
current interface (figure 4.2). The few people we interviewed were satisfied with our
interface and wished that we could modify the font and picture sizes in the future.
figure 4.1 System first interface
figure 4.2 System second and current Interface
4.4 System User login / logout and register functions
This system will have different types of users: administrators, regular users,
webmasters, and new users. For the prototype, we will create a login/logout function
for regular users and a register function for new users. We describe the new user
register function first, followed by the regular user login/logout function. The
prototype will not distinguish among the different types of users, except for new
users.
4.4.1 User Register Function
Figure 4.3 User Register Page
In order to create a new account, the system needs some information about the user,
such as user name, password, first name, last name, and email. If the user name is
already in use, the system returns and tells the user to choose another one. If the
system finds the same email, it asks the user to enter another email.
Below is the pseudo code of the register function.
If (user submits the information)
    If (user name already exists in database)
        Return and ask user to key in another user name
    Else if (email already exists in database)
        Return and ask user to key in another email
    Else
        Save the data into database
Else
    Display the form for user to key in the information
Figure 4.4 User Register Pseudo code
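The register logic in figure 4.4 can be sketched as runnable code. The version below is in Python with an in-memory SQLite database rather than the project's PHP and MySQL. The Account table, its columns, and the function names are hypothetical, and storing a salted PBKDF2 hash instead of the raw password is an added assumption, not something the report specifies.

```python
import hashlib
import os
import sqlite3


def open_db():
    """Create an in-memory database with a hypothetical Account table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Account ("
        "Username TEXT PRIMARY KEY, Email TEXT NOT NULL UNIQUE, "
        "Salt BLOB NOT NULL, PasswordHash BLOB NOT NULL)"
    )
    return conn


def register(conn, username, email, password):
    """Reject a duplicate user name or email, otherwise save the new user."""
    taken = conn.execute(
        "SELECT 1 FROM Account WHERE Username = ?", (username,)
    ).fetchone()
    if taken:
        return "username_taken"
    taken = conn.execute(
        "SELECT 1 FROM Account WHERE Email = ?", (email,)
    ).fetchone()
    if taken:
        return "email_taken"
    # Assumption: keep a salted hash, never the plain-text password.
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)
    conn.execute(
        "INSERT INTO Account VALUES (?, ?, ?, ?)", (username, email, salt, digest)
    )
    conn.commit()
    return "ok"
```

The caller would map "username_taken" and "email_taken" to the two messages described in the text and redisplay the form.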
4.4.2 User Login / Logout function
To log in, a user needs to enter the correct user name and password. Once the user
logs in, the cookies/sessions are set and the user can log out at any time. Once the
user logs out, the cookies/sessions are reset. Below is the pseudo code of the login
function.
If (user already logged in)
    Return User Error
Else
    If (user has submitted the information)
        If (user name or password is incorrect)
            Return and ask user to key in the correct user name and password
        Else
            Set the cookies/session and save the data into database
    Else
        Display the form for user to key in to login
Figure 4.5 User Login Pseudo code
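The login and logout flow in figure 4.5 could look roughly like the following Python sketch. It keeps sessions in a dictionary instead of PHP cookies/sessions, and the class name, the random session tokens, and the check_credentials callback (which would look the user up in the database) are all assumptions made for illustration.

```python
import secrets


class SessionStore:
    """In-memory stand-in for the cookie/session handling described above."""

    def __init__(self, check_credentials):
        # check_credentials(username, password) -> bool; hypothetical callback.
        self._check = check_credentials
        self._sessions = {}  # session token -> user name

    def login(self, username, password, token=None):
        """Return (token, status) following the branches of the pseudo code."""
        if token is not None and token in self._sessions:
            return None, "already_logged_in"
        if not self._check(username, password):
            return None, "invalid_credentials"
        new_token = secrets.token_hex(16)  # unguessable session identifier
        self._sessions[new_token] = username
        return new_token, "ok"

    def logout(self, token):
        """Reset the session; return True if a session was actually removed."""
        return self._sessions.pop(token, None) is not None

    def current_user(self, token):
        """Return the logged-in user name for a token, or None."""
        return self._sessions.get(token)
```

After logout, current_user(token) returns None, which is the condition that step 12 of the testing methods (pressing the Back button after logging out) is meant to verify.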
4.5 Sale function
We have different categories of sale items. For the prototype, we will create the
Book function and the Cars function because of their popularity among users. The two
functions are similar but not identical. We will pick the best one according to user
preferences.
4.5.1 Book function
Our book category has view, post, update, and delete functions. First, we create
the Book table in the database and insert some fake data into the table. Then, we
write HTML and SQL code embedded in PHP code (see figure 4.6).
Set global variable
Connect to MySQL database
Request queries: select all books in the database
Process the queries
Return the result of the queries to user interface
Figure 4.6 pseudo code of Book view
4.5.2 Cars Function
The Car functions include searching for cars, posting a new car, and updating and
deleting existing cars. The search function allows users to search car information
based on different criteria such as vehicle make, car type, transmission type, model
type, and number of doors. The post function allows users to add new car information
to the database. The update function allows users to update information that they
previously inserted. The delete function allows users to remove old car information
from the database.
Include necessary libraries
Connect to the MySQL database
Request a query, such as insert, update, or delete
Use PHP to arrange the data
Output the data in HTML format
Figure 4.7 basic pseudo code of Car function
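A search over several optional criteria, as described for the Car function, is usually built by adding one condition per supplied criterion. The following Python sketch shows one way to do this with placeholders; the criteria keys, the column names (modeled on the Car table), and the function name are assumptions, not the project's actual code.

```python
# Hypothetical mapping from search-form fields to Car table columns.
SEARCHABLE_COLUMNS = {
    "brand": "Brandid",
    "type": "Typeid",
    "transmission": "Ctransmission",
    "model": "Cmodel",
    "doors": "CNumDoor",
}


def build_car_query(criteria):
    """Return (sql, params) for the criteria the user actually filled in."""
    clauses = []
    params = []
    for field, value in criteria.items():
        if field not in SEARCHABLE_COLUMNS:
            # Only whitelisted column names are ever placed in the SQL text.
            raise ValueError("unknown search field: " + field)
        clauses.append(SEARCHABLE_COLUMNS[field] + " = ?")
        params.append(value)
    sql = "SELECT Cid, Cmodel, Cprice FROM Car"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params
```

For example, build_car_query({"transmission": "Automatic", "doors": 4}) yields a query with two conditions and the parameter list ["Automatic", 4]. Because user input only ever travels in the parameter list, it cannot change the structure of the query.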
4.6 News function
Figure 4.8 View All News page
The News function has three parts. The first (figure 4.8) is the view-all-news page,
for which the system picks the available news from the database and arranges it in
date order: the upcoming item comes first and the rest follow. The second (figure
4.9) is the post-news page. The system previews the message once before saving the
data into the database, so that the user can catch careless mistakes in the
information. The last (figure 4.10) is the view-one-news page, which displays all
information about the news item that the user selects. Besides that, the system has
an RSS (Really Simple Syndication) function with which users can view the news as on
online news websites (such as CNN.com). Figure 4.11 shows the XML file generated by
the PHP scripts from data in the database. The code for most of the functions is
similar; only the display code differs according to its use.
figure 4.9 Post News Page
figure 4.10 View one News
figure 4.11 XML RSS feed page
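Generating an RSS feed from news rows, as described for figure 4.11, can be sketched as follows. This is a Python illustration using the standard library XML module, not the project's PHP script; the function name, the channel description text, and the example link are assumptions, while the keys Ntitle and Ndescription follow the News table columns.

```python
import xml.etree.ElementTree as ET


def build_rss(channel_title, channel_link, news_rows):
    """Build an RSS 2.0 document (as a string) from a list of news rows."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # An RSS 2.0 channel requires title, link, and description elements.
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = "News items from the bulletin board"
    for row in news_rows:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = row["Ntitle"]
        ET.SubElement(item, "description").text = row["Ndescription"]
    # ElementTree escapes characters such as & and < in the text nodes.
    return ET.tostring(rss, encoding="unicode")
```

A call such as build_rss("MU Bulletin Board News", "http://example.invalid/news", rows) returns the feed text, which the web server would send with an XML content type. Building the document through an XML library rather than string concatenation avoids malformed feeds when a news title contains special characters.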
5.1 Introduction and System Overview
Now that the system prototype has been implemented, we test its performance and look
for any errors against the design specification. First (section 5.2), we report the
system performance observed in our testing. Then (section 5.3), we run the system
through a few other tests for better results. Lastly (section 5.4), we discuss user
evaluation of the system prototype, which tells us what users really want from our
project.
5.2 System Performance
When users navigate the MU Bulletin Board, each page loads within 5 seconds because
the database server is stable and available, and the server responds in a short
period of time. The database queries are simple, but there are more than four queries
in one page, which will delay data retrieval if we have multiple users. Our server is
used for a database course and many users may connect to it at the same time; this
has slowed the saving and retrieving of data stored in the database (as we showed in
the presentation).
Besides speed, the safety and security of the system are weak. Using our own attack
tests, we are able to crash our own system without difficulty. The reason is that
the error checking in our PHP scripts relies on JavaScript. Hackers can disable
JavaScript and enter any kind of data to crash our database. The best solution is
server-side error checking and an SSL secure connection, which we discuss further in
future work.
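Server-side error checking means repeating every check on the server, since anything done in the browser can be bypassed. The sketch below illustrates the idea in Python rather than PHP; the field names, the patterns, and the length limits are assumptions chosen for the example, not the project's rules.

```python
import html
import re

# Hypothetical rules: 3 to 20 letters, digits, or underscores for user names.
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{3,20}")
# Deliberately loose email shape check: something@something.something
EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def validate_registration(form):
    """Return a dict of field -> error message; empty means the form is valid."""
    errors = {}
    if not USERNAME_RE.fullmatch(form.get("username", "")):
        errors["username"] = "User name must be 3-20 letters, digits, or underscores."
    if not EMAIL_RE.fullmatch(form.get("email", "")):
        errors["email"] = "Email address is not valid."
    if len(form.get("password", "")) < 6:
        errors["password"] = "Password must have at least 6 characters."
    return errors


def render_text(value):
    """Escape user-supplied text before placing it in an HTML page."""
    return html.escape(value)
```

With checks like these on the server, input such as a script tag in the user name field is rejected even when JavaScript is disabled, and any stored text is escaped before it is displayed.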
5.3 System Testing
Our testing procedure includes individual testing, system integration testing, and
inspection. Individual testing started while we were implementing the system. In
individual testing, all input fields are validated by the system using JavaScript
functions and simple PHP functions. System integration testing occurs when we combine
all our functions into one working system. We add components one by one to ensure
that they interact with each other correctly. In the inspection process, one of us
looks through all the code to find any common mistakes.
Testing plan:
Input test:
Text field:
1. Enter a random number into the text field.
2. Enter random characters into the text field.
3. Enter an empty string into the text field.
4. Enter JavaScript into the text field.
5. Enter an HTML tag into the text field.
Dropdown box:
1. Select each item in the dropdown box to check that the correct value is
selected.
2. Download the web page that contains the dropdown box and change the HTML tag
to <select name=bb><option value=10000>. In this case, 10000 will be
passed to the server.
Session test:
1. Log in to the web site. Go through each page to check that every page
directs to the correct location.
Inspection:
1. Check for loops that run past array boundaries.
2. Check the variable types in use.
3. Check for variables that are not used.
4. Check that web pages redirect to the correct place.
Figure 5.1 Testing Plan
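The second dropdown test in figure 5.1 shows that a dropdown box does not limit what reaches the server. One common defense is to compare the submitted value against the list of options the page actually offered. A minimal Python sketch of that check follows; the option values and the function name are hypothetical.

```python
# Hypothetical set of values offered by a "number of doors" dropdown box.
ALLOWED_DOOR_VALUES = {"2", "4", "5"}


def parse_dropdown(raw_value, allowed_values):
    """Return the value if the page offered it, otherwise raise ValueError."""
    if raw_value not in allowed_values:
        raise ValueError("value was not one of the offered options")
    return raw_value
```

With this check in place, the tampered value 10000 from the testing plan is rejected on the server instead of being passed on to a query.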
5.4 System Evaluation
Since we have limited time to implement and integrate the system, we do not
have enough time for user evaluation. For now, we assume that users will be
satisfied with our system. We will carry out user evaluation as soon as we have a
stable web server.
6. Conclusions and Discussion
of MU Bulletin Board
version 1.0
Prepared by: Carol Teng Yik Lim
Team Name: MU Bulletin Board
Mentor: John Boyer
May 08, 2005
6.1 Introduction and Overview
Fortunately, our first prototype was completed within the time frame. Although the
MU Bulletin Board is not fully working, we are satisfied with the system. First
(section 6.2), each member of the team gives a conclusion. Then (section 6.3), we
discuss the problems that we faced during the implementation and integration of the
system.
6.2 Conclusion
6.2.1 Wang Yi’s conclusion
After going through the whole process (system analysis, requirement analysis, and
design specification) of the project, I learned that every phase is important for a
successful project, especially the design phase. System developers should pay
attention to many details during the ERD design in order to avoid later
implementation problems.
6.2.2 Wen Lei’s conclusion
In my conclusion, security is very important for web applications. Always enabling
server-side validation will greatly decrease the chance of getting hacked.
Communication between programmer and designer is needed to ensure that the correct
functions are implemented.
6.2.3 Carol Lim’s conclusion
In my opinion, every phase in the whole process is very important in designing and
implementing a new project. Although we had limited time to implement this project,
we learned a lot about web applications and e-business. Security, availability, and
reliability are the three main issues for a successful web application. User
evaluation is also important for attracting more users to the project.
6.3 Discussion
We faced some problems and eventually solved them within the time limit. Because we
received permission to use the MySQL server (database server) late, we had limited
time to implement our system. This slowed our progress, and we had to change our
schedule and simplify our prototype. Although we had some PHP scripts working well
on other servers, we found it difficult to transfer them to our server because of
the server's different policies. As a result, we had to redesign the code and
implement it ourselves, which reduced the functionality of our first prototype.
Besides that, we had some communication problems. We could not have a proper
meeting during the day or night because some of the members have to work or take care
of their families. The only time that we are all available is during the capstone
class, which is not enough. Although we were in frequent email contact, we faced some
integration problems when we combined the individual functions into one system.
Fortunately, one of our members created a set of general library files, which sped
up system integration.
7.1 Future Work
We will contact our mentor and ask for his opinion about our system. It would be
great if he wants to adopt our system. If this happens, we will finish all the
remaining functions and add some useful functions, such as administrator functions,
since we would then be using a stable, permanent database and web server. If not, we
will try to find a stable web server and finish the remaining functions. We will add
a feedback function for users in order to gather more user requirements. We may
improve our interface to make it clearer and more user friendly. The most important
task after we host the system is to improve its security. We can add more server-side
error checking scripts instead of client-side JavaScript, along with a secure
connection to the database server.
Database Create Table List
CREATE TABLE User (
Uid INT,
Password VARCHAR(20),
Uidentify VARCHAR(20),
Name VARCHAR(30),
CONSTRAINT PK_User PRIMARY KEY (Uid)
);
CREATE TABLE Catalogue (
Catalogue_name VARCHAR(50),
CONSTRAINT PK_Catalogue PRIMARY KEY (catalogue_name)
);
CREATE TABLE Electronic (
Eid INT,
Uid INT NOT NULL,
Catalogue_name VARCHAR(50) NOT NULL,
Eprice DOUBLE NOT NULL,
Ename VARCHAR(30) NOT NULL,
Emodel VARCHAR(20),
Ebrand VARCHAR(20),
Eyear YEAR,
Econdition VARCHAR(100),
Epost_date DATE,
Epicture VARCHAR(100),
CONSTRAINT PK_Electronic PRIMARY KEY (Eid, Catalogue_name),
CONSTRAINT FK_Electronic1 FOREIGN KEY (Catalogue_name)
REFERENCES Catalogue (Catalogue_name) ON DELETE CASCADE,
CONSTRAINT FK_Electronic2 FOREIGN KEY (Uid)
REFERENCES User (Uid) ON DELETE CASCADE
);
CREATE TABLE Book (
Bid INT,
Uid INT NOT NULL,
ISBN VARCHAR(20) NOT NULL,
Bcondition VARCHAR(100),
Btitle VARCHAR(20) NOT NULL,
Bprice DOUBLE NOT NULL,
Author VARCHAR(50) NOT NULL,
Bpost_date DATE,
Bpicture VARCHAR(100),
Catalogue_name VARCHAR(50) NOT NULL,
CONSTRAINT PK_Book PRIMARY KEY (Bid,Catalogue_name),
CONSTRAINT FK_Book1 FOREIGN KEY (Catalogue_name)
REFERENCES Catalogue (Catalogue_name) ON DELETE CASCADE,
CONSTRAINT FK_Book2 FOREIGN KEY (Uid)
REFERENCES User (Uid) ON DELETE CASCADE
);
CREATE TABLE Ticket (
Tid INT,
Tprice DOUBLE NOT NULL,
TLocation VARCHAR(50) NOT NULL,
Tdate DATE NOT NULL,
Ttime TIME NOT NULL,
Tdescription VARCHAR(100),
Catalogue_name VARCHAR(50) NOT NULL,
Uid INT NOT NULL,
CONSTRAINT PK_Ticket PRIMARY KEY (Tid, Catalogue_name),
CONSTRAINT FK_Ticket1 FOREIGN KEY (Catalogue_name)
REFERENCES Catalogue (Catalogue_name) ON DELETE CASCADE,
CONSTRAINT FK_Ticket2 FOREIGN KEY (Uid)
REFERENCES User (Uid) ON DELETE CASCADE
);
CREATE TABLE Furniture (
Fid INT,
Fprice DOUBLE NOT NULL,
Fitem VARCHAR(50) NOT NULL,
Fcondition VARCHAR(100),
Fyear YEAR,
Fpost_date DATE,
Fpicture VARCHAR(100),
Catalogue_name VARCHAR(50) NOT NULL,
Uid INT NOT NULL,
CONSTRAINT PK_Furniture PRIMARY KEY (Fid, Catalogue_name),
CONSTRAINT FK_Furniture1 FOREIGN KEY (Catalogue_name)
REFERENCES Catalogue (Catalogue_name) ON DELETE CASCADE,
CONSTRAINT FK_Furniture2 FOREIGN KEY (Uid)
REFERENCES User (Uid) ON DELETE CASCADE
);
CREATE TABLE CarBrand(
Brandid INT,
BrandName VARCHAR(50),
CONSTRAINT PK_CarBrand PRIMARY KEY (Brandid)
);
CREATE TABLE CarType(
Typeid INT,
TypeName VARCHAR(50),
CONSTRAINT PK_CarType PRIMARY KEY (Typeid)
);
CREATE TABLE Car (
Cid INT,
Uid INT NOT NULL,
Brandid INT,
Typeid INT,
Catalogue_name VARCHAR(50) NOT NULL,
Cprice DOUBLE NOT NULL,
Cmodel VARCHAR(20),
Milege INT NOT NULL,
Cpost_date DATE,
Cpicture VARCHAR(100),
Ctransmission VARCHAR(50) NOT NULL,
Cyear YEAR NOT NULL,
Ccondition VARCHAR(100),
CNumDoor INT,
CONSTRAINT PK_Car PRIMARY KEY (Cid, Catalogue_name),
CONSTRAINT FK_Car1 FOREIGN KEY (Catalogue_name)
REFERENCES Catalogue (Catalogue_name) ON DELETE CASCADE,
CONSTRAINT FK_Car2 FOREIGN KEY (Uid)
REFERENCES User (Uid) ON DELETE CASCADE,
CONSTRAINT FK_Car3 FOREIGN KEY (Brandid)
REFERENCES CarBrand (Brandid),
CONSTRAINT FK_Car4 FOREIGN KEY (Typeid)
REFERENCES CarType (Typeid)
);
CREATE TABLE News (
Nid INT,
Ntitle VARCHAR(50),
Nlocation VARCHAR(50),
Npost_date VARCHAR(15),
Ndescription TEXT,
Uid INT NOT NULL,
Ntype VARCHAR(25),
Ncomments TEXT,
Nstart_month INT,
Nstart_day INT,
Nstart_year INT,
Nstart_time VARCHAR(15),
Nend_month INT,
Nend_day INT,
Nend_year INT,
Nend_time VARCHAR(15),
CONSTRAINT PK_News PRIMARY KEY (Nid),
CONSTRAINT FK_News FOREIGN KEY (Uid)
REFERENCES User (Uid) ON DELETE CASCADE
);
CREATE TABLE Contact (
Uid INT,
Lname VARCHAR(20) NOT NULL,
Fname VARCHAR(20) NOT NULL,
Email VARCHAR(40) NOT NULL,
Phone VARCHAR(12),
Cellphone VARCHAR(20),
Street VARCHAR(50),
City VARCHAR(20),
State VARCHAR(20),
Zip VARCHAR(10),
CONSTRAINT PK_Contact PRIMARY KEY (Uid, Email),
CONSTRAINT FK_Contact FOREIGN KEY (Uid)
REFERENCES User (Uid) ON DELETE CASCADE
);