Technical Specifications for the CUAHSI Hydrologic Information System

Prepared by
Darrin Svehlak and David R. Maidment
Center for Research in Water Resources
University of Texas at Austin

John Helly
San Diego Supercomputer Center
University of California, San Diego

March 2005

The following specifications list the computer hardware, operating systems, estimated costs, application and software packages, and staffing needed to implement the CUAHSI Hydrologic Information System (HIS) infrastructure, as best these are understood at this time. What is described here is a centralized hydrologic information infrastructure that would be set up at one location on a university campus, or possibly spread among several locations, depending on existing facilities and the capabilities of the system administrators. This is not a specification of the client software at the user end, which will vary depending on the extent to which the HIS is used. For simply getting files from the Digital Library or observations data from the Observations Database, there are no special software requirements for client machines.

There are three parts to the infrastructure: the Hydrologic Digital Library, the Hydrologic Observations Database, and Unidata's Local Data Manager (LDM). Many different systems and configurations could be used to attain a similar infrastructure, and these specifications are only a guideline. Depending on your environment and needs, you could pick and choose the pieces of the infrastructure that work best for you. Everything listed below is for a production environment. For testing purposes, you would not need the robust servers in this example.

The long-term goal is to have a hydrologic information system that can be installed and operated either on the Windows operating system or on Linux/Unix. At present, some components operate in one operating system and not the other.

Hydrologic Observations Database

The hydrologic observations database is a relational database that stores streamflow, rainfall, water quality, groundwater and climate data measured at point locations and serves them through an internet interface. Currently this is implemented using the ESRI products ArcIMS (Internet Map Server) and ArcSDE (Spatial Database Engine). The data themselves are stored in a commercial relational database (MS SQL Server, DB2 or Oracle) that must first be installed on the database server. Individual data users can copy the data into an ESRI Personal Geodatabase, which is an MS Access file.

Specifications

ArcIMS server
    Workstation servers: 1
    CPU: Single Xeon processor at 3.06 GHz
    Memory: 2 GB
    Current hard disk config: 36 GB (RAID 1)
    Hard disk requirements (low end)**: 36 GB
    Hard disk requirements (high end)**: 146 GB
    Example server: Dell PowerEdge 2650
    Estimated cost for server: $3000-$5000
    Operating system: Windows 2000/2003 Server
    Applications/tools packages: ArcIMS 9.0, IIS 5.0, Tomcat 4.1.29, J2SDK 1.4.2
    Network transfer speed: 1 gigabit per second
    Technical skills required: Network Analyst (Windows); Systems Analyst (VB .NET or Java for Internet apps)

ArcSDE server
    Workstation servers: 1
    CPU: Dual Xeon processors at 2.4 GHz
    Memory: 1 GB
    Current hard disk config: 36 GB (RAID 1) + 270 GB (RAID 5)
    Hard disk requirements (low end)**: 270 GB
    Hard disk requirements (high end)**: ** depends on area analyzed
    Example server: Dell PowerEdge 2650
    Estimated cost for server: $5000-$7000
    Operating system: Windows 2000/2003 Server
    Applications/tools packages: ArcSDE 9.0, SQL Server 2000
    Network transfer speed: 1 gigabit per second
    Technical skills required: Network Analyst (Windows); Database Administrator (DBA); Research Engineer (knowledge of GIS, modeling, etc.)

For further information on this specification, contact Darrin Svehlak, (512) 471-3111, [email protected]

Digital Watershed

In addition to the Hydrologic Observations Database, the Digital Watershed also contains a significant volume of GIS data, weather and climate grids, and remote sensing information. This information can be stored on the same server that holds the relational database and ArcSDE.
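
To make the relational storage of point observations more concrete, the following is a minimal sketch of how such data might be laid out and queried. It is an illustration under stated assumptions, not the CUAHSI schema: the table names, column names, and sample values are hypothetical, and SQLite (from the Python standard library) stands in for the commercial database named in this specification.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only and are
# not taken from the CUAHSI specification. SQLite stands in for the
# commercial database (SQL Server, DB2 or Oracle) named in the text.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE site (
        site_id   INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        latitude  REAL NOT NULL,
        longitude REAL NOT NULL
    );
    CREATE TABLE observation (
        obs_id      INTEGER PRIMARY KEY,
        site_id     INTEGER NOT NULL REFERENCES site(site_id),
        variable    TEXT NOT NULL,   -- e.g. streamflow, rainfall
        observed_at TEXT NOT NULL,   -- ISO 8601 timestamp
        value       REAL NOT NULL,
        units       TEXT NOT NULL
    );
    """
)

# Made-up sample data for one measurement site.
conn.execute("INSERT INTO site VALUES (1, 'Example gauge', 35.1, -77.0)")
conn.executemany(
    "INSERT INTO observation (site_id, variable, observed_at, value, units) "
    "VALUES (?, ?, ?, ?, ?)",
    [
        (1, "streamflow", "2005-03-01T00:00:00", 12.5, "m3/s"),
        (1, "streamflow", "2005-03-02T00:00:00", 14.0, "m3/s"),
        (1, "rainfall", "2005-03-01T00:00:00", 3.2, "mm"),
    ],
)


def series(conn, site_id, variable):
    """Return (timestamp, value) pairs for one variable at one site, oldest first."""
    rows = conn.execute(
        "SELECT observed_at, value FROM observation "
        "WHERE site_id = ? AND variable = ? ORDER BY observed_at",
        (site_id, variable),
    )
    return rows.fetchall()


streamflow = series(conn, 1, "streamflow")
```

Keeping sites and observations in separate tables means that each point location is described once, however many measurements are recorded there.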

Hard disk space will depend on the area of the watershed. For the Neuse Digital Watershed example, the current requirements are as follows: 200 GB (remote sensing data) + 100 GB (digital watershed analysis) = 300 GB. Also, when real-time observational data are collected, the disk space requirements increase quickly. For real-time data, in most circumstances, the geographically selected portions of the incoming files will be extracted, concatenated with previous data and archived. The rest of the real-time data will be discarded as new data are received.

Hydrologic Digital Library

The Hydrologic Digital Library is a repository of digital files of any character (called Arbitrary Digital Objects) that are indexed by a metadata catalog, which operates in the PostgreSQL relational database. This system is currently implemented in Linux. The size of the external disk required depends on the size of the collections being housed and could be on the order of 300 to 500 GB.

Specifications
    Number of servers/workstations: 1 high-level workstation
    CPU: Pentium 4 at 2.0 GHz
    Memory: 1 GB
    Current hard disk config: 30 GB + external disk for backup / extra capacity
    Estimated costs for servers/workstations and software: $3,000 (hardware) + free (software)
    Operating systems: Red Hat Linux 9 or OS X (10.3 or later); support an open architecture
    Applications/tools packages: Red Hat 9 development tools (gcc compiler + libraries); Perl 5 (with DBI and DBD::PgPP); Java; Apache
    Network transfer speed: Broadband or greater (i.e., not dialup)
    Technical skills required***: System Administrator: 0.25-0.5 FTE; Network Data Manager: 0.5-1.0 FTE w/ hydrology and IT background
    ++ Range in FTE is dependent on skill level + data quality assurance + data/metadata harvesting + network coordination

For more information on this specification, please contact John Helly, (858) 534-5060, [email protected]

Unidata's Local Data Manager

This is a system for receiving real-time streams of weather and climate
information, including NEXRAD data from Unidata, an NSF-sponsored data center for atmospheric sciences located in Boulder, CO. For more information about Unidata and their services, see http://my.unidata.ucar.edu/

Unidata LDM Specifications
    Number of servers/workstations: 1
    CPU: Dual Xeon processors at 2.4 GHz
    Memory: 2 GB
    Current hard disk config: 30 TB (DataDirect SATA disk pool)
    Hard disk requirements (low end)**:
    Hard disk requirements (high end)**: 30 TB (DataDirect SATA disk pool)
    Example servers/workstations: Dell PowerEdge 2650
    Estimated costs for servers/workstations: $5,500
    ++ Excludes 30 TB disk pool
    Operating systems: Red Hat Linux 9
    Applications/tools packages: Unidata LDM package; other Unidata components: GEMPAK, Decoders, UD-Units, and NetCDF
    ++ Real-time data files are additionally indexed by SRB (Storage Resource Broker)
    Network transfer speed: Gigabit
    Technical skills required***: Network Analyst (Unix); Database Administrator (DBA); Programmer: Java, C, C++, Perl

***Note on technical skills required: This requirement depends on the environment. Smaller environments with fewer resources may have to get by with fewer staff, which may mean relying on support from outside groups when the need arises. Larger environments may have more resources and therefore be able to dedicate staff to these tasks. In all environments, there will need to be someone equivalent to a network analyst to install and maintain hardware, operating systems, software components, and networking. To run all of the listed infrastructure components, there would need to be adequate staff to administer both a Windows and a Unix environment. Additionally, if the database is very large, a dedicated database administrator would probably be beneficial in the long run, but may not be necessary in all situations. Finally, a systems analyst or programmer would be needed to write code in languages such as VB .NET and Java.
This individual would provide customization (especially of ArcIMS) and the link between the hydrologic data and the servers.
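
As an illustration of the real-time handling described for the Digital Watershed (extract the geographically selected portion, concatenate it with previous data, archive it, and discard the rest), here is a minimal sketch. It assumes each incoming record carries a latitude and longitude and that the area of interest is a simple bounding box; the names Record, BoundingBox and ingest, and all coordinate values, are hypothetical and are not part of the LDM package or of this specification.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Record:
    """One real-time observation (hypothetical structure)."""
    lat: float
    lon: float
    timestamp: str
    value: float


@dataclass(frozen=True)
class BoundingBox:
    """Rectangular area of interest in decimal degrees (an assumption)."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, rec: Record) -> bool:
        return (self.south <= rec.lat <= self.north
                and self.west <= rec.lon <= self.east)


def ingest(archive: List[Record], incoming: Iterable[Record],
           box: BoundingBox) -> int:
    """Append in-box records to the archive; return how many were discarded."""
    discarded = 0
    for rec in incoming:
        if box.contains(rec):
            archive.append(rec)   # concatenate with previously archived data
        else:
            discarded += 1        # outside the area of interest: not kept
    return discarded


# Illustrative values only, not the actual extent of any watershed.
box = BoundingBox(south=34.5, west=-79.5, north=36.5, east=-76.0)
archive = [Record(35.0, -78.0, "2005-03-01T00:00", 1.2)]
batch = [
    Record(35.5, -77.5, "2005-03-01T01:00", 0.8),    # inside the box
    Record(40.0, -105.0, "2005-03-01T01:00", 2.4),   # outside the box
]
dropped = ingest(archive, batch, box)
```

Filtering before archiving keeps the stored volume proportional to the area analyzed rather than to the size of the full incoming stream, which is the reason the specification ties disk requirements to watershed area.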