Technical Specifications for the CUAHSI Hydrologic Information System
Prepared by Darrin Svehlak and David R. Maidment
Center for Research in Water Resources
University of Texas at Austin
John Helly
San Diego Supercomputer Center
University of California, San Diego
March 2005
The following specifications show the computer hardware, operating systems, estimated costs,
application/software packages, etc. needed to implement the CUAHSI Hydrologic Information
System infrastructure as best these are understood at this time. What is being described here is
the definition of a centralized hydrologic information infrastructure that would be set up at one
location on a University campus, or possibly spread among several locations depending on
existing facilities and capabilities of system administrators. This is not the specification of the
client software at the user end, which will vary depending on the extent to which the HIS is used.
For just getting files from the Digital Library or observations data from the Observations
Database, there are no special software requirements for client machines.
There are three parts to the infrastructure: Hydrologic Digital Library, Hydrologic Observations
Database, and Unidata’s Local Data Manager (LDM). There are many different systems and
configurations that could be used to attain a similar infrastructure and these specifications are
only a guideline. Depending on your environment and needs, you could pick and choose pieces
of the infrastructure that work best for you. Everything listed below is for a production
environment. For testing purposes, you would not need the robust servers in this example.
The long-term goal is to have a hydrologic information system that can be installed and operated
on either the Windows operating system or Linux/Unix. At present, some components run under
one operating system and not the other.
Hydrologic Observations Database
The hydrologic observations database is a relational database that stores streamflow, rainfall,
water quality, groundwater and climate data measured at point locations and serves them through
an internet interface. Currently this is implemented using the ESRI products ArcIMS (Internet
Map Server) and ArcSDE (Spatial Database Engine). The data are stored in a commercial
relational database (MS SQL Server, DB2 or Oracle) that must first be installed on the database
server. Individual data users can copy the data into an ESRI Personal Geodatabase,
which is an MS Access file.
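As a rough illustration of how point observations might be laid out relationally, the sketch below uses Python's built-in sqlite3 module as a stand-in for the commercial database. The table names, column names, site and data values are all hypothetical; the specification does not define a schema, so this is only one plausible arrangement.

```python
import sqlite3

# Hypothetical schema: names and columns are illustrative, not from the specification.
SCHEMA = """
CREATE TABLE sites (
    site_id   INTEGER PRIMARY KEY,
    name      TEXT NOT NULL,
    latitude  REAL NOT NULL,
    longitude REAL NOT NULL
);
CREATE TABLE observations (
    obs_id      INTEGER PRIMARY KEY,
    site_id     INTEGER NOT NULL REFERENCES sites(site_id),
    variable    TEXT NOT NULL,   -- e.g. streamflow, rainfall
    observed_at TEXT NOT NULL,   -- ISO 8601 timestamp
    value       REAL NOT NULL,
    units       TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database holding a few made-up observations."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO sites VALUES (1, 'Example gauge', 35.0, -78.0)")
    conn.executemany(
        "INSERT INTO observations (site_id, variable, observed_at, value, units) "
        "VALUES (?, ?, ?, ?, ?)",
        [
            (1, "streamflow", "2005-03-01T00:00", 12.5, "m3/s"),
            (1, "streamflow", "2005-03-02T00:00", 14.0, "m3/s"),
            (1, "rainfall", "2005-03-01T00:00", 3.2, "mm"),
        ],
    )
    return conn


def series_for(conn, site_id, variable):
    """Return the time-ordered (timestamp, value) pairs for one site and variable."""
    rows = conn.execute(
        "SELECT observed_at, value FROM observations "
        "WHERE site_id = ? AND variable = ? ORDER BY observed_at",
        (site_id, variable),
    )
    return rows.fetchall()


conn = build_demo_db()
print(series_for(conn, 1, "streamflow"))
```

Keeping sites and observations in separate tables means the location of a gauge is stored once, however many values are recorded there.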
Specifications

                                        ArcIMS                        ArcSDE
Number of Workstation Servers           1                             1
Specs for Workstation Servers
  CPU                                   Single Xeon Processor         Dual Xeon Processors
                                        at 3.06 GHz                   at 2.4 GHz
  Memory                                2 GB                          1 GB
  Current Hard Disk Config              36 GB (RAID 1)                36 GB (RAID 1) + 270 GB (RAID 5)
  Hard Disk Requirements (low end)**    36 GB                         270 GB
  Hard Disk Requirements (high end)**   146 GB                        ** depends on area analyzed
Example Servers                         Dell PowerEdge 2650           Dell PowerEdge 2650
Estimated Costs for Servers             $3000-$5000                   $5000-$7000
Operating Systems                       Windows 2000/2003 Server      Windows 2000/2003 Server
Applications/Tools Packages             ArcIMS 9.0                    ArcSDE 9.0
                                        IIS 5.0                       SQL Server 2000
                                        Tomcat 4.1.29
                                        J2SDK 1.4.2
Network Transfer Speed                  1 Gigabit per second          1 Gigabit per second
Technical Skills Required               Network Analyst (Windows)     Network Analyst (Windows)
                                        Systems Analyst (VB .NET or   Database Administrator (DBA)
                                        Java for Internet Apps)       Research Engineer (knowledge
                                                                      of GIS, modeling, etc.)
For further information on this specification, contact Darrin Svehlak, (512) 471-3111.
Digital Watershed
In addition to the Hydrologic Observations Database, the Digital Watershed also contains a
significant volume of GIS data, weather and climate grids and remote sensing information. This
information can be stored on the same server that hosts the relational database and ArcSDE.
Hard disk space will depend on the area of the watershed. For the Neuse Digital Watershed
example, the current requirements are as follows: 200 GB (remote sensing data) + 100 GB (digital
watershed analysis) = 300 GB. When real-time observational data are collected, the disk
space requirements increase quickly. In most circumstances, the geographically selected portions
of the real-time files will be extracted, concatenated with previous data and archived. The rest
of the real-time data will be discarded as new data are received.
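The extract, concatenate and discard cycle for real-time data can be sketched roughly as follows. The record layout, the bounding-box test and the sample coordinates are assumptions made for illustration only. Real feeds arrive as files in their own formats, and the specification does not say how the geographic selection is performed.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Record:
    """One real-time value at a point (hypothetical layout)."""
    time: str
    lat: float
    lon: float
    value: float


@dataclass(frozen=True)
class BoundingBox:
    """Geographic window of interest, in decimal degrees."""
    south: float
    west: float
    north: float
    east: float

    def contains(self, rec: Record) -> bool:
        return (self.south <= rec.lat <= self.north
                and self.west <= rec.lon <= self.east)


def ingest(archive: List[Record], incoming: List[Record], box: BoundingBox) -> int:
    """Append in-window records to the archive; return how many were discarded."""
    kept = [rec for rec in incoming if box.contains(rec)]
    archive.extend(kept)  # concatenate with previously archived data
    return len(incoming) - len(kept)


# Illustrative window and values only; not taken from the specification.
window = BoundingBox(south=34.5, west=-79.5, north=36.5, east=-76.0)
archive: List[Record] = [Record("2005-03-01T00:00", 35.2, -78.1, 1.4)]
batch = [
    Record("2005-03-01T01:00", 35.2, -78.1, 2.0),   # inside the window, kept
    Record("2005-03-01T01:00", 40.0, -105.3, 0.0),  # outside the window, discarded
]
discarded = ingest(archive, batch, window)
print(len(archive), discarded)
```

Because only the in-window subset is retained, archive growth scales with the watershed area rather than with the full size of each incoming feed.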
Hydrologic Digital Library
The Hydrologic Digital Library is a repository of digital files of any character (called Arbitrary
Digital Objects) that are indexed by a metadata catalog, which operates in the PostgreSQL
relational database. This system is currently implemented in Linux. The size of the external disk
required depends on the size of the collections being housed and could be of the order of
300-500 GB.
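To make the idea of a metadata catalog over arbitrary files concrete, here is a minimal sketch. It uses Python with the built-in sqlite3 module as a stand-in for PostgreSQL. The table layout, the field names and the choice of a SHA-256 checksum are assumptions for illustration and do not describe the library's actual catalog.

```python
import hashlib
import sqlite3


def open_catalog():
    """Create an in-memory catalog; the column set is hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE objects (
               object_id  INTEGER PRIMARY KEY,
               path       TEXT NOT NULL UNIQUE,
               size_bytes INTEGER NOT NULL,
               sha256     TEXT NOT NULL,
               title      TEXT NOT NULL,
               keywords   TEXT NOT NULL   -- comma-separated
           )"""
    )
    return conn


def register(conn, path, content, title, keywords):
    """Record one digital object; the file content itself stays outside the catalog."""
    digest = hashlib.sha256(content).hexdigest()
    cur = conn.execute(
        "INSERT INTO objects (path, size_bytes, sha256, title, keywords) "
        "VALUES (?, ?, ?, ?, ?)",
        (path, len(content), digest, title, ",".join(keywords)),
    )
    return cur.lastrowid


def find_by_keyword(conn, keyword):
    """Return the paths of objects tagged with exactly this keyword."""
    rows = conn.execute(
        "SELECT path FROM objects WHERE ',' || keywords || ',' LIKE ? ORDER BY path",
        (f"%,{keyword},%",),
    )
    return [row[0] for row in rows]


catalog = open_catalog()
register(catalog, "reports/example.pdf", b"report bytes", "Example report", ["streamflow", "neuse"])
register(catalog, "grids/example.nc", b"grid bytes", "Example grid", ["rainfall"])
print(find_by_keyword(catalog, "streamflow"))
```

The catalog stores only descriptive metadata and a checksum, so objects of any type or size can be indexed in the same way.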
Specifications

Number of Servers/Workstations          1 high-level workstation
Specs for Servers/Workstations
  CPU                                   Pentium 4 at 2.0 GHz
  Memory                                1 GB
  Current Hard Disk Config              30 GB + external disk for backup / extra capacity
Estimated Costs for Servers/            $3,000 (hardware) + free (software)
Workstations and Software
Operating Systems                       Linux Redhat 9 or OS X (10.3 or later)
                                        Support an open architecture
Applications/Tools Packages             Redhat 9 development tools (gcc compiler + libraries)
                                        Perl 5 (with DBI and DBD::PgPP)
                                        Java
                                        Apache
Network Transfer Speed                  Broadband or greater (i.e., not dialup)
Technical Skills Required***            System Administrator: 0.25-0.5 FTE
                                        Network Data Manager: 0.5-1.0 FTE, w/ hydrology and IT background
                                          + data quality assurance
                                          + data/metadata harvesting
                                          + network coordination
                                        ++Range in FTE is dependent on skill level
For more information on this specification, please contact John Helly, (858) 534-5060.
Unidata’s Local Data Manager
This is a system for receiving real-time streams of weather and climate information, including
NEXRAD data, from Unidata, an NSF-sponsored data center for atmospheric sciences located in
Boulder, CO. For more information about Unidata and its services, see
http://my.unidata.ucar.edu/
Unidata LDM Specifications

Number of Servers/Workstations          1
Specs for Servers/Workstations
  CPU                                   Dual Xeon Processors at 2.4 GHz
  Memory                                2 GB
  Current Hard Disk Config
  Hard Disk Requirements (low end)**    30 TB (DataDirect SATA disk pool)
  Hard Disk Requirements (high end)**   30 TB (DataDirect SATA disk pool)
Example Servers/Workstations            Dell PowerEdge 2650
Estimated Costs for Servers/            $5,500
Workstations                            ++excludes 30 TB disk pool
Operating Systems                       Linux Redhat 9
Applications/Tools Packages             Unidata LDM Package
                                        Other Unidata components: GEMPAK, Decoders, UD-Units, and NetCDF
                                        ++Real-time data files are additionally indexed by SRB
                                        (Storage Resource Broker)
Network Transfer Speed                  Gigabit
Technical Skills Required***            Network Analyst (Unix)
                                        Database Administrator (DBA)
                                        Programmer: Java, C, C++, Perl
***Note on technical skills required: This requirement depends on the environment.
Smaller environments with fewer resources may have to get by with fewer staff, which may mean
relying on support from outside groups when the need arises. Larger environments may have more
resources and therefore be able to dedicate staff to individual tasks. In all environments,
someone equivalent to a network analyst will be needed to install and maintain hardware,
operating systems, software components, and networking. To run all of the listed infrastructure
components, there would need to be adequate staff to administer both Windows and Unix
environments. Additionally, if the database is very large, a dedicated database administrator
would probably be beneficial in the long run, but may not be necessary in all situations. Finally,
a systems analyst or programmer would be needed to write code in languages such as VB .NET
and Java. This individual would provide customization (especially of ArcIMS) and the link
between the hydrologic data and the servers.
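For budgeting, the hardware figures quoted in the specification tables can be added up as in the short sketch below. It covers only the server and workstation prices stated in this document. The 30 TB disk pool, any commercial software licenses and staff time are not priced in the source and are left out, so the result is a rough lower bound rather than a full cost estimate.

```python
# Hardware cost ranges in US dollars, copied from the specification tables.
HARDWARE_COSTS = {
    "ArcIMS server": (3000, 5000),
    "ArcSDE server": (5000, 7000),
    "Digital Library workstation": (3000, 3000),
    "LDM server (excluding 30 TB disk pool)": (5500, 5500),
}


def total_range(costs):
    """Sum the low and high ends of each (low, high) cost range."""
    low = sum(lo for lo, _ in costs.values())
    high = sum(hi for _, hi in costs.values())
    return low, high


low, high = total_range(HARDWARE_COSTS)
print(f"${low:,} - ${high:,}")  # prints "$16,500 - $20,500"
```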