A SYSTEM PROTOTYPE

As a proof of concept, a prototype system for scalable data mining has been developed on top of the vertical database concept and tested successfully. The prototype was designed as a multi-layered software framework. The system is formally named DataMIME™ (Serazi et al, 2004). Its layers are the Data Mining Interface (DMI), the Data Capture and Data Integration Interface (DCI/DII), the Data Mining Algorithm (DMA) layer, and the Distributed Ptree Management Interface (DPMI). The DMI provides counting, the most important data mining operation supported by P-trees, over basic P-trees, value P-trees, tuple P-trees, interval P-trees, and cube P-trees. The DMI also provides the P-tree algebra, which has four operations, AND, OR, NOT (complement), and XOR, implementing the pointwise logical operations on P-trees for the DMA layer. The DCI/DII allows users to capture data and integrate it into the format the system requires (P-tree format). The DPMI layer provides access, location, and concurrency transparency by hiding the facts that data representations and resource access protocols may differ, that resources may be located in different places, and that resources may be shared by several competing users. The DMA layer contains a collection of data mining tools, e.g. P-KNN (Khan et al, 2002), PINE (Perrizo et al, 2003), P-BAYESIAN (Perera et al, 2002), P-SVM (Pan et al, 2004), and P-ARM (Ding et al, 2002). Besides these core layers, the system provides a graphical user interface that allows flexible user interaction with the system.

To understand how the vertical database concept affects the system, some key concepts must be grasped. Unlike in a traditional database, data is not stored in a horizontal, row-based format; it is stored in a compressed, vertical P-tree format. The DPMI layer is responsible for storing and managing this P-tree based vertical data in the system.
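The counting and P-tree algebra operations described above can be illustrated with a minimal sketch. It rests on several assumptions: each vertical bit slice is held as an uncompressed Python integer used as a bit vector, whereas an actual P-tree is a compressed tree built over such a slice; and the function names (`basic_bit_slices`, `complement`, `value_slice`, `count`) are hypothetical and are not part of the DataMIME interface.

```python
from typing import List


def basic_bit_slices(column: List[int], width: int) -> List[int]:
    """Split a column of unsigned integers into `width` vertical bit slices.

    Slice j (0 = most significant bit) is a Python int used as a bit vector:
    bit i of the slice is set when row i has a 1 in bit position j.
    Assumption: uncompressed stand-in for a basic P-tree.
    """
    slices = [0] * width
    for row, value in enumerate(column):
        for j in range(width):
            if (value >> (width - 1 - j)) & 1:
                slices[j] |= 1 << row
    return slices


def complement(bits: int, n_rows: int) -> int:
    """NOT, restricted to the n_rows valid bit positions."""
    return ~bits & ((1 << n_rows) - 1)


def value_slice(slices: List[int], value: int, n_rows: int) -> int:
    """Bit vector marking the rows whose attribute equals `value`.

    Built by ANDing each bit slice (or its complement), so no row is scanned.
    """
    width = len(slices)
    result = (1 << n_rows) - 1  # start with every row selected
    for j, bits in enumerate(slices):
        wanted = (value >> (width - 1 - j)) & 1
        result &= bits if wanted else complement(bits, n_rows)
    return result


def count(bits: int) -> int:
    """Number of rows selected by a bit vector."""
    return bin(bits).count("1")


# Hypothetical 3-bit attribute over six rows.
column = [5, 3, 5, 7, 0, 5]
n_rows = len(column)
slices = basic_bit_slices(column, 3)

fives = value_slice(slices, 5, n_rows)    # AND of slices and complements
sevens = value_slice(slices, 7, n_rows)
five_or_seven = fives | sevens            # OR of two value vectors
msb_differs_lsb = slices[0] ^ slices[2]   # XOR of two bit slices

print(count(fives))            # 3
print(count(five_or_seven))    # 4
print(count(msb_differs_lsb))  # 1
```

In this sketch the count for a value is obtained purely from logical operations on the vertical slices, which is the property the DMI layer relies on. A real implementation would apply the same operations to compressed P-trees rather than to flat bit vectors.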
The efficient bit-wise operations on vertical data give data mining algorithms their scalability, and these operations are provided through the DMI layer. Finally, this uniform, efficient vertical data structure at the lowest layer can take advantage of the latest hardware.

CONCLUSION

Horizontal data structures have proven inefficient for data mining on very large data sets because of the high cost of scanning. It is therefore important to develop vertical data structures and algorithms to address the scalability issue. Various structures have been proposed, among which the P-tree is a very promising vertical structure. This database model is not a set of indexes but a collection of representations of the dataset itself. P-trees have shown great performance in processing data containing a large number of tuples, owing to the fast logical AND operation, which requires no scanning (Ding et al, 2002). In general, horizontal data organization is preferable for transactional data whose intended output is a relation, and vertical data structure is more appropriate for data mining on very large data sets.