The Gateway Computational
Web Portal
Marlon Pierce, Choonhan Youn,
Geoffrey Fox
ERDC, August 16 2001
Tutorial Overview
• Demo
• Grid and Gateway Overviews
• HTML Forms
• Java QuickStart Guide
• JavaServer Pages Overview
• Gateway JSP Tools
• WebFlow Module Development
• Installation and Security Issues
Computational Grids Survey
A brief introduction to computational
grid projects and goals.
What Is a Computational Grid?
• Grids link distributed scientific resources.
– Resources can be geographically, politically distributed
• Goal: provide means for sharing resources
between organizations.
• Example “high-end” resources:
– Supercomputers and clusters
– Mass storage
– Advanced visualization (CAVES) and collaboration
(Access Grid).
– Particle colliders, telescopes, earthquake detectors
• www.globus.org/research/papers/anatomy.pdf
What Does a Grid Need?
• Multi-institutional security
– PKI or Kerberos
• Information services
– Manage, store, deliver information about resources.
– Use information to make decisions
• Scheduling and Queuing
– Advance reservation
– Meta-queuing
• Remote execution, file transfer, monitoring
Example of a Grid Problem:
CERN’s Large Hadron Collider
• Goes on-line in 2005
• Will generate petabytes of raw, distributed
data, terabytes of event summary data.
• Computing resources for data analysis will
be distributed between CERN and regional
centers spread all over the world
• 1500-2000 people will collaborate on
experiments.
Grid Projects
• Grid Infrastructure
– Condor: www.cs.wisc.edu
– Globus: www.globus.org
– Legion: www.cs.virginia.edu/~legion
• Grid Applications
– Netsolve: www.cs.utk.edu/netsolve
– Ninf: www.etl.go.jp
• Global Grid Forum: www.gridforum.org
Examples of Deployed Grids
• NASA’s Information Power Grid
– Links NASA’s Ames, Glenn, and Langley Centers.
– LaunchPad currently available
– www.ipg.nasa.gov
• DOE’s ASCI Distributed Resource Management
– Links classified computing resources at Lawrence
Livermore, Los Alamos, and Sandia National Labs.
– Full deployment scheduled by Nov 2001.
Latest Grid News
• NSF will spend $53 million on the
Distributed Terascale Facility (DTF)
– 13.6 teraflops, 600 terabytes, 40 Gigabit/sec
– DTF sites: NCSA, SDSC, Argonne, CalTech
– Industry partners: IBM, Intel, Qwest
• See www.ncsa.uiuc.edu/News/Access/Releases
for more information (August 9).
Example: Globus
• Run applications remotely:
– globus-job-run: interactive.
– globus-job-submit: batch for PBS, LSF.
– globusrun: most general version (RSL).
• Split jobs between hosts.
• Send and retrieve data securely (PKI).
• Monitor jobs remotely.
• Monitor hosts remotely.
What’s the Problem?
• Globus client must be installed on desktop
– Difficult installation
– No ubiquitous access (PDAs, your grandmother’s PC)
• Typical solution is to support Globus at particular
sites and have users remotely log in.
– Problems arise because many users are not Unix-savvy.
• Lots of new commands to learn.
Computational Portals
• Computational portals are designed to
simplify access to grid technologies.
• Also provide coarse-grained grid approach
that ties grid and non-grid resources.
– Not everyone uses Technology X.
– Not everything at a TechX supporting site will
use TechX.
– Different TechX sites may remain separate.
Gateway Architecture
• Gateway is implemented in a three-tiered
architecture.
• Browser Front End
– JSP dynamically generates HTML pages.
• Component Middle Tier
– JavaBeans on the web server.
– Distributed WebFlow servers.
• HPC back end
– Link to grid and non-grid services with rsh, ssh.
– More sophisticated interfaces can be built.
[Figure: Gateway architecture. Labels: Web Browser and Client Applications; HTTP(S); Web Server and Servlet Engine (JVM) with JavaBean Local Services and JavaBean Service Proxies; SECIOP; WebFlow Master Server; WebFlow Child Servers; Data Storage; HPC+LSF; HPC+PBS; Condor Flock; Globus Grid]
Gateway Design Goals
• Build a working portal for users.
• Produce a tool chest for portal developers.
• Targeted Services:
– File Transfer
– Problem organization and session archiving
– Batch script generation
– Job submission
– Job monitoring
– Shared visualization
– Security
Levels of Use
• Users and Admins can
do everything through
the web.
• Portal developers may
want to edit pages, use
our components.
• Advanced developers
can write modules.
[Figure: levels of use. Labels: Portal Users and Administrators; Portal Developers; Module Developers]
Gateway Descriptors
How to add your codes and your
hosts to the portal.
Gateway Descriptors
• Form the base of portal for any particular
field.
• Collect static info about applications, hosts
in an XML data record.
• Application Descriptors describe how to run
codes.
• Host descriptors describe HPC systems.
• Users are described by another mechanism.
Sample Application Descriptor
<XSIL Name="ANSYS" Type="csm.parseXMLDesc">
<Param Name="NumberOfInParams">0</Param>
<Param Name="NumberOfInFiles">1</Param>
<Param Name="NumberOfOutParams">0</Param>
<Param Name="NumberOfOutFiles">1</Param>
<Param Name="IOStyle">StandardIO</Param>
Sample Host Descriptor
<XSIL Name="Modi4" Type="csm.parseXMLHost">
<Param Name="HostName">modi4</Param>
<Param Name="QueueType">LSF</Param>
<Param Name="ExecPath">/usr/bin/ansys57</Param>
<Param Name="WorkDir">/scratch</Param>
<Param Name="QsubPath">/usr/bin/bsub</Param>
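The descriptors above are flat lists of named parameters, so reading them takes very little code. The following is a minimal sketch using the JDK's built-in DOM parser, not the portal's own parseXMLBean. The class name DescriptorReader and its methods are hypothetical, and the sample string closes the record with </XSIL>, which is an assumption because the slide shows only the opening lines.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Gateway.
class DescriptorReader {
    // Host descriptor from the slide; the closing </XSIL> tag is assumed.
    static final String SAMPLE =
          "<XSIL Name=\"Modi4\" Type=\"csm.parseXMLHost\">\n"
        + "<Param Name=\"HostName\">modi4</Param>\n"
        + "<Param Name=\"QueueType\">LSF</Param>\n"
        + "<Param Name=\"ExecPath\">/usr/bin/ansys57</Param>\n"
        + "<Param Name=\"WorkDir\">/scratch</Param>\n"
        + "<Param Name=\"QsubPath\">/usr/bin/bsub</Param>\n"
        + "</XSIL>";

    // Collects every <Param Name="...">value</Param> into an ordered map.
    static Map<String, String> readParams(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList params = doc.getElementsByTagName("Param");
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < params.getLength(); i++) {
            Element param = (Element) params.item(i);
            result.put(param.getAttribute("Name"), param.getTextContent().trim());
        }
        return result;
    }

    // Convenience wrapper that turns checked exceptions into unchecked ones.
    static String sampleValue(String name) {
        try {
            return readParams(SAMPLE).get(name);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For example, DescriptorReader.sampleValue("QueueType") returns the queue type that a script generator would need to choose between LSF and PBS.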
Adding Your Application
• We store application and host data in a
single file.
– Applications “contain” hosts.
• You can create and edit this by hand, or
• You can use administrator interface to edit
the data record.
• Admin interface also lets you verify data.
– Did I give the right executable path?
Java Quick Start Guide
A quick and dirty overview of the
Java programming language.
Basic Elements
• The Java language resembles C/C++:
– Primitive types: int, float, double, char, boolean
– Strings are actually classes (more later on this)
– Standard control structures like for and while loops, if
statements, case/switch statements, try/catch blocks.
• Some important differences from C/C++:
– No pointers
– Method arguments are always passed by value.
– No preprocessors or macros.
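The pass-by-value rule often surprises C/C++ programmers because object variables hold references, and it is the reference that gets copied. A small sketch (the class and method names are illustrative only) shows the three cases: a primitive, a reassigned reference, and a mutated object.

```java
// Sketch: Java copies every argument, including object references.
class PassByValueDemo {
    // Changes only the method's local copy of the int.
    static void increment(int n) {
        n = n + 1;
    }

    // Reassigning the parameter does not affect the caller's variable.
    static void reassign(StringBuilder sb) {
        sb = new StringBuilder("replaced");
    }

    // Mutating through the copied reference is visible to the caller,
    // because both copies refer to the same object.
    static void mutate(StringBuilder sb) {
        sb.append(" Portal");
    }

    static String run() {
        int count = 1;
        increment(count);                       // count is still 1
        StringBuilder name = new StringBuilder("Gateway");
        reassign(name);                         // name still holds "Gateway"
        mutate(name);                           // name now holds "Gateway Portal"
        return count + ":" + name;
    }

    public static void main(String[] args) {
        System.out.println(run());              // prints "1:Gateway Portal"
    }
}
```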
If/Else Statement Format
if(condition1) {
//conditionally executed code
}
else if(condition2) {
//conditionally executed code
}
else {
//conditionally executed code
}
For Loops
• Syntax:
for(int i=0;i<MAX;i++) {
//executed code
}
• MAX is a variable defined elsewhere.
Java Classes
• Java is object-oriented
– Classes encapsulate data and methods (functions)
within a single entity.
– Objects are instances of classes.
• Analogy: the declaration “int i” creates an instance of an
integer.
• The Java SDK comes with an extensive library of
pre-defined classes for you to use.
• See the online API:
– http://java.sun.com/j2se/1.3/docs/api/
Example Class: Hashtable
• Hashtable allows you to store name/value
pairs.
• To create a new hashtable object:
Hashtable myhash=new Hashtable();
• You can now use Hashtable methods:
myhash.put("MyName","Marlon");
String name=
(String)myhash.get("MyName");
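As a complete program, the Hashtable calls above might look like the sketch below. It assumes a JDK with generics (Java 5 or later), which removes the need for the (String) cast shown on the slide; the class name HashtableDemo is illustrative.

```java
import java.util.Hashtable;

class HashtableDemo {
    // Stores one name/value pair and reads it back.
    static String lookup(String key) {
        Hashtable<String, String> myhash = new Hashtable<>();
        myhash.put("MyName", "Marlon");
        // get() returns null when the key is absent.
        return myhash.get(key);
    }

    public static void main(String[] args) {
        System.out.println(lookup("MyName"));   // prints "Marlon"
        System.out.println(lookup("NoSuchKey")); // prints "null"
    }
}
```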
References
• The Java web site has API documentation
and tutorials:
– http://java.sun.com/j2se/1.3/docs/
• Excellent reference text:
– “Core Java” Volumes I and II by Cay
Horstmann and Gary Cornell (Prentice Hall)
• O’Reilly publishes the API:
– “Java in a Nutshell” by David Flanagan
Interactive HTML
Using HTML forms to tie widgets to
server actions.
The <form> Tag
• The <form> … </form> tag pair surrounds all
HTML input types.
• Format:
<form name="myform" method="Post"
action="/GOW/servlet/someAction">
… <!-- Input tags go here -->
</form>
• The “action” attribute specifies what happens
when an input button is pressed.
– Can be CGI, a servlet, or a JSP page
The <input> Tag
• Input tags define text fields, submit buttons,
radio buttons, menus, …
• Several can be combined within a single
<form>
• Format:
<form method="GET" action="servlet/myServlet">
<input type="text" name="text" value="Sample">
<input type="submit">
</form>
Putting It All Together
<html>
<body>
<form action="someaction"
method="Post">
Please type your name:
<input type="text"
name="myname"
value="Marlon">
<input type=“submit”>
</form>
</body>
</html>
What Happens When I Click the
Button?
• The CGI Script/Servlet/JSP/… specified in the
action receives all name/value fields of <input>.
• These are sent in the HTTP request to the server.
• Usually you should use “Post” instead of “Get”
– No size limit on requests with Post
– Requests are not shown in the browser location field.
• The server-side code usually returns output to the
browser.
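With the default form encoding, the name/value fields arrive as a single URL-encoded string such as myname=Marlon (in the request body for Post, in the URL for Get). A servlet container decodes this for you, but a hand-written sketch makes the format concrete. The class FormBodyDemo is hypothetical and handles only the simple case of one value per name.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that decodes "name=value&name2=value2" form data.
class FormBodyDemo {
    static Map<String, String> parse(String body) {
        Map<String, String> fields = new LinkedHashMap<>();
        if (body.isEmpty()) {
            return fields;
        }
        for (String pair : body.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            fields.put(decode(name), decode(value));
        }
        return fields;
    }

    // URLDecoder turns '+' into a space and %XX escapes into characters.
    private static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling FormBodyDemo.parse("myname=Marlon+Pierce&code=ANSYS") yields a map with two entries, myname and code.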
JavaServer Pages (JSP)
Putting Java into HTML to build
dynamic web pages.
What Are JavaServer Pages?
• JSP let you embed Java code into your
HTML web pages.
– Use .jsp extension
• When the browser requests the page, the
server translates the JSP into a servlet, executes
it, and returns the output.
• To run this, you need a special server
– Apache’s Tomcat, IBM’s WebSphere, …
Embedding Scriptlets
<%@ page import="java.util.Date" %>
<%@ page import="java.util.Hashtable" %>
<html><body>
Hello, Marlon
<%
Hashtable Myhash=new Hashtable();
Date now=new Date();
Myhash.put("date",now);
%>
The time is <%= now.toString() %>
<!-- More HTML and scriptlets to follow. -->
</body></html>
What Does It Mean?
• The “import” directives at the top name the
Java classes that the page uses.
• Everything between <% and %> is
interpreted as Java.
– These sections are called scriptlets.
• Java can be in-lined with HTML using the
<%= %> tags.
– These are referred to as expressions.
Ex: For Loop of Radio Buttons
<%
for (int i=0;i<5;i++) {
%>
Radio<%= i %>
<input type="radio">
<%
}
%>
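Because a JSP page is translated into a servlet, the loop above ends up as ordinary Java that writes the HTML pieces to the response. The sketch below approximates that translation with a StringBuilder standing in for the container's output writer; the class name RadioLoopDemo is illustrative, and real generated servlet code looks different in detail.

```java
// Rough plain-Java equivalent of the radio-button scriptlet loop.
class RadioLoopDemo {
    static String render() {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < 5; i++) {
            // Template text plus the <%= i %> expression.
            out.append("Radio").append(i).append("\n");
            // Static HTML between the scriptlet blocks.
            out.append("<input type=\"radio\">\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(render());
    }
}
```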
Using JavaBeans in JSP
• As presented so far, JSP still requires extensive
knowledge of the Java API.
– O’Reilly’s Java API “Nutshell” is 600 pages.
• JavaBeans are custom components that
encapsulate specific sets of functions.
– Develop a small set of classes for area-specific tasks.
• It is good design to separate display from control
code so that each is reusable.
– You don’t want sprawling JSP pages.
Separation of Responsibility
• Portal Users: define functionality, look and feel (L&F)
• Web Developers: work on L&F
• Java Programmers: develop Beans
JavaBeans in JSP
• Create an instance “gem” of the gemBean class:
<jsp:useBean id="gem" class="gem.gemBean"
scope="session"/>
• You can now use “gem” like any other object:
gem.loadData();
gem.runSim();
• By setting the scope, pages can share beans.
• You can also use this tag to initialize the bean once:
code in the tag body runs only when the bean is created.
• Other HTML-like tags exist for accessing data.
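For a class to work with jsp:useBean it needs a public no-argument constructor, and the standard jsp:setProperty and jsp:getProperty tags rely on get/set method pairs. The source of gem.gemBean is not shown in these slides, so the following is a hypothetical stand-in named SimBean that only illustrates the conventions.

```java
import java.io.Serializable;

// Hypothetical bean; not the real gem.gemBean.
// In a portal this class would be declared public, in its own file and package.
class SimBean implements Serializable {
    private String inputFile = "";
    private boolean loaded = false;

    // Beans are created by the container, so the constructor takes no arguments.
    public SimBean() {
    }

    // Getter/setter pair exposing the "inputFile" property.
    public String getInputFile() {
        return inputFile;
    }

    public void setInputFile(String inputFile) {
        this.inputFile = inputFile;
    }

    // Stand-in for real work: marks the data as loaded once a file is named.
    public void loadData() {
        loaded = !inputFile.isEmpty();
    }

    public boolean isLoaded() {
        return loaded;
    }
}
```

With session scope, a value set on one page (for example the input file name) is still present when a later page in the same session looks up the bean.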
Overview of Gateway JSPs
• Welcome Page
– Sets up most beans
– Buttons are included in TrackNavigator.jsp
CodeSelect.jsp
• Codes are read in from the Application
Descriptor file.
• The page is generated automatically from
the descriptor.
• Problem name is mapped to user context
directory, where session data will be stored.
JobSubmit.jsp
• Based on selected code, forms are generated
automatically.
• Application Descriptor file specifies the
number of input files, parameters, etc.
Submitted.jsp
• Shows the generated queue script, based on
user requests.
• The user has one last chance to edit.
• The “Submit” button can be tied to an
action to run the script.
ReturnPage.jsp
• Job has been submitted.
• The track navigator is again included at the
bottom of the page.
Gateway Bean Classes
An overview of the Bean classes that
can be used to build portals.
Gateway Architecture
• We have developed a
number of service
beans for
computational portals.
• Some accomplish
specific tasks on
server.
• Others act as proxies
to WebFlow modules
(next section).
Context Data
• Gateway organizes user sessions into
“problems” and “sessions.” A problem
contains one or more sessions.
• All of this is called Context data. It maps to
a directory on the server.
• All information gathered from the user is
stored as name/value pairs in the
appropriate subdirectory.
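One simple way to picture context data is a properties file inside a problem/session directory. The sketch below uses java.util.Properties and a temporary directory; the file name session.properties, the directory names, and the class ContextStoreDemo are all assumptions for illustration and do not describe Gateway's actual on-disk format.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

// Hypothetical sketch of problem/session storage as directories of name/value pairs.
class ContextStoreDemo {
    static final String FILE_NAME = "session.properties"; // assumed file name

    // Writes the pairs to <root>/<problem>/<session>/session.properties.
    static void save(Path root, String problem, String session, Properties values)
            throws IOException {
        Path dir = root.resolve(problem).resolve(session);
        Files.createDirectories(dir);
        try (OutputStream out = Files.newOutputStream(dir.resolve(FILE_NAME))) {
            values.store(out, "context data");
        }
    }

    // Reads the pairs back from the same location.
    static Properties load(Path root, String problem, String session) throws IOException {
        Properties values = new Properties();
        Path file = root.resolve(problem).resolve(session).resolve(FILE_NAME);
        try (InputStream in = Files.newInputStream(file)) {
            values.load(in);
        }
        return values;
    }

    // Saves one pair under a temporary root and returns the value read back.
    static String roundTrip(String name, String value) {
        try {
            Path root = Files.createTempDirectory("context");
            Properties values = new Properties();
            values.setProperty(name, value);
            save(root, "My_Problem", "session1", values);
            return load(root, "My_Problem", "session1").getProperty(name);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```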
ContextManagerBean
• Contains convenience methods for finding
old problems and sessions, creating new
ones, deleting old ones, etc.
• Common Methods: too many to list. Come
to the lab or see the documentation
– www.gatewayportal.org/DOC/index.html
moduleServerBean
• Hides the messy details of connecting to
WebFlow and getting an instance of the
module you want.
• Creates instances of all WebFlow modules,
provides accessor methods for them.
• So to get the submitJob module, I just use
submitJob sj=modserver.getSubmitJob();
in my JSP page.
parseXMLBean
• Parses the Application Descriptor data
record.
• Provides specific getters for hosts,
applications.
• Provides general getters for other
parameters:
– getCodeTagValue("ANSYS","IOStyle");
createScript
• This is an abstract
superclass of script
generators.
– Extend it with
createPBS.java,
createLSF.java,
createCSH.java, etc.
• Actual class created at
runtime with
scriptFactory.java.
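The superclass-plus-factory arrangement described above can be sketched in a few lines. The class and method names here (ScriptGenerator, PbsScriptGenerator, LsfScriptGenerator, ScriptFactorySketch) are hypothetical and much simpler than the real createScript classes, whose source is not shown in the slides. The directive forms "#PBS -N" and "#BSUB -J" are the usual PBS and LSF ways of naming a job.

```java
// Hypothetical abstract superclass: subclasses supply scheduler-specific directives.
abstract class ScriptGenerator {
    abstract String jobNameDirective(String jobName);

    // Shared layout: shell line, scheduler directive, then the command to run.
    String generate(String jobName, String command) {
        return "#!/bin/sh\n" + jobNameDirective(jobName) + "\n" + command + "\n";
    }
}

class PbsScriptGenerator extends ScriptGenerator {
    String jobNameDirective(String jobName) {
        return "#PBS -N " + jobName;
    }
}

class LsfScriptGenerator extends ScriptGenerator {
    String jobNameDirective(String jobName) {
        return "#BSUB -J " + jobName;
    }
}

// Hypothetical factory: picks the concrete generator at runtime
// from the queue type, such as the QueueType value in a host descriptor.
class ScriptFactorySketch {
    static ScriptGenerator forQueue(String queueType) {
        switch (queueType) {
            case "PBS":
                return new PbsScriptGenerator();
            case "LSF":
                return new LsfScriptGenerator();
            default:
                throw new IllegalArgumentException("Unknown queue type: " + queueType);
        }
    }
}
```

The calling page never names a concrete class, so supporting another queuing system means adding one subclass and one case in the factory.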
setPropBean
• JSPs communicate by
sending HTTP requests to
each other.
• We have many name/value
pairs to write to the
Context data directory.
• setPropBean provides
automating methods to
remove drudgery and cut
out page bulk.
• Other JSPs can recover
data using
ContextManager.
[Figure: setPropBean data flow. Labels: JSP; HTTP Request; setPropBean; ContextManager; Context Data]
Miscellaneous Beans
• jobInfoBean: convenient wrapper around a
Hashtable for storing name/value strings.
• nameEncodeBean: inserts/removes
underscores in problem names. Used to
create unix directory names.
• GetFileBean: reads/writes script files to
disk, filters out control characters.
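The underscore encoding that nameEncodeBean performs might look like the sketch below. It assumes the bean swaps spaces and underscores, which the slides do not state explicitly, and the class name NameEncodeSketch is hypothetical.

```java
// Hypothetical sketch: assumes spaces in problem names become underscores.
class NameEncodeSketch {
    // "Beam Stress Test" -> "Beam_Stress_Test", safe to use as a directory name.
    static String encode(String problemName) {
        return problemName.trim().replace(' ', '_');
    }

    // Reverses encode(); a name that originally contained underscores
    // would not survive the round trip.
    static String decode(String directoryName) {
        return directoryName.replace('_', ' ');
    }
}
```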
Page Control
• Page flow is controlled by
the servlet
GOWAdminServlet.java.
– Pages call this servlet,
which invokes the next
page.
• The servlet receives the
request from page A,
looks up the next page to
display, and shows it.
Commands
• Commands are classes that implement a
simple “Command” interface.
– Must implement the execute() method.
• ForwardCommand: Simplest case. Just
forwards control to the specified page.
• SubmitCommand: Assembles and executes
a remote command to run a job before
displaying the next page.
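The page-control and command ideas above follow the classic command pattern. The sketch below is a simplified, hypothetical version: the real Command interface and GOWAdminServlet work with servlet request objects, while this one uses a plain map and returns the name of the next page so that it can run outside a servlet container. The page names in the usage note come from the JSP overview earlier in the tutorial.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical simplified interface: returns the page to display next.
interface Command {
    String execute(Map<String, String> request);
}

// Simplest case: no work, just name the next page.
class ForwardCommand implements Command {
    private final String nextPage;

    ForwardCommand(String nextPage) {
        this.nextPage = nextPage;
    }

    public String execute(Map<String, String> request) {
        return nextPage;
    }
}

// Stand-in for the administrative servlet: looks up the command
// registered for the calling page and runs it.
class PageController {
    private final Map<String, Command> commands = new HashMap<>();

    void register(String fromPage, Command command) {
        commands.put(fromPage, command);
    }

    String handle(String fromPage, Map<String, String> request) {
        Command command = commands.get(fromPage);
        if (command == null) {
            throw new IllegalArgumentException("No command registered for " + fromPage);
        }
        return command.execute(request);
    }
}
```

Registering new ForwardCommand("JobSubmit.jsp") under "CodeSelect.jsp" makes handle("CodeSelect.jsp", ...) return "JobSubmit.jsp". A submit-style command would do its work inside execute() before returning the next page.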
WebFlow Modules
An overview of how to use existing
modules and how to write your own.
The Role of WebFlow
• WebFlow servers can distribute portal
services over many hosts.
• WebFlow can do this because it is
hierarchical:
– Single parent acts as gatekeeper for child
servers.
– Ex: Run main server at FSU, child server at
NCSA to provide access to remote file system.
WebFlow Design
• WebFlow is a custom-built
component system.
– Implements JavaBeans spec
using CORBA
• Servers contain “contexts”
(abstract containers) and
“modules”.
– Contexts are organizational,
can be remote (i.e. child
servers).
– Modules are CORBA
implementation files.
Configuring WebFlow Servers
• WebFlow servers are configured with text files.
• Header:
– Name of server
– File to write IOR if it is a master server
– Parent
– URL of IOR file (if child).
• List of provided modules follows:
– Name
– Location of interface (IDL or XML)
– Java package name of module
Some Standard Modules
• submitJob: executes external local and
remote commands (rsh, ssh), moves files to
and from remote systems (rcp, scp).
• remotefile: moves files between client and
server machines.
• ContextManager: can manage remote
contexts. Uses two helper modules.
• Charon: HTTP security module.
Using Modules
• Modules are just Java classes.
– API on the web at
www.gatewayportal.org/DOC/index.html.
• Get instance in JSP page using
moduleServerBean.
– You can now invoke the object’s methods on
the remote server as if they were local.
Developing Modules
• Develop IDL interface (list of methods)
• Must compile IDL with Orbacus’s jidl. Generates
CORBA stubs, skeletons.
• Write a Java implementation file
– Defines methods of the interface.
• Compile it all.
• Add it to the appropriate server’s configuration
file.
• Modify moduleServerBean to make it available to
the JSP pages.
IDL Boilerplate
#ifndef _WEBFLOW_
#include "../BC.idl"
#endif
module WebFlow {
  module myModule {
    interface myModule : BeanContextChild {
      // Insert your methods here.
      void test();
      string execCommand(in string command);
      …
    };
  };
};
Implementation File BoilerPlate
package WebFlow.myModule;

public class myModuleImpl extends WebFlow.BeanContextChildSupport
        implements myModuleOperations {
    String msg_;
    org.omg.CORBA.Object peer;

    public myModuleImpl(org.omg.CORBA.Object peer,
                        String msg) throws WebFlow.NullPointerException {
        super(peer);
        this.peer = peer;
        // Assign the field. Declaring a new local "String msg_" here
        // would shadow the field and leave it null.
        msg_ = msg;
    }

    // Your method definitions go here.
}
Web Portal Security
A review of some security issues and
some minimal recommendations.
Multi-tiered Security
• Multiple tiers require
security between and within
each tier.
• Security issues:
– Authentication
– Authorization
– Privacy
• Implementing these end-to-end is a challenge.
Some Minimal Security
Suggestions
• Use SSL-enabled Apache web server.
• Disable remote access to Tomcat.
• Use multiple authentication methods
– HTTP Authentication
– Client certificates
• Use ssh or kerberized rsh, not plain rsh.
• Put on test bed first, log all usage.
Next Steps
• Add meta-job descriptors to provide better
links between HPC and visualization.
• Improved 3D graphics for remote
visualization.
• Component interfaces to Condor and
Globus.
– Globus CoG kits are available for Java.
– GPDK provides Bean bridge to CoG.
Some Resources
• Gateway web site: www.gatewayportal.org.
– All materials and software can be downloaded
from here.
• Grid Computing Environments:
www.computingportals.org.
• My contact info:
– Email: [email protected]
– Phone: (937)904-5140
Coda: Topics for Lab Session
Hands-on activities for Thursday’s
lab.
Lab Topics
• Installing and configuring Tomcat.
• Installing and configuring Apache with
SSL.
• Downloading, configuring, and running
WebFlow.
• Modifying GEM sample portal JSP pages.
[Figure: portal security. Labels: Browser; Web Server and Servlet Engine; HTTP; HTTP(S); Charon Module; Charon Client; SECIOP; WebFlow Server; Desktop Client; Remote Server]
[Figure: page control. Labels: JavaServer Page; Request; Response; Administrative Servlet; Command Interface; Forward Command; Submit Command]
[Figure: script generation. Labels: Script Generator Superclass; Script Factory; PBS, LSF, and GRD Script Generators; PBS, LSF, and GRD Scripts]
[Figure: WebFlow hierarchy. Labels: WebFlow Parent Server; Child Server A; Child Server B; User Contexts; Proxy Images; Modules]