CGI Programming: Part 1
Robert M. Dondero, Ph.D.
Princeton University
1
Objectives
You will learn about:
Common Gateway Interface (CGI) programming
GET method
POST method
In Python and Java
2
Motivation for CGI Programming
Problem:
Often static web pages are insufficient
Often web server must generate web pages
dynamically
Based upon parameters supplied by browser
Using data fetched from a database
One solution:
Common Gateway Interface (CGI) protocol
3
URLs Revisited
URL format:
protocol://host:port/file.cgi?name1=value1&name2=value2&...
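As a rough illustration of the URL format above, the pieces can be pulled apart with Python's standard urllib.parse module. The URL used here (localhost, port 8080) is a made-up example, not one from the course.

```python
from urllib.parse import urlsplit, parse_qs


def describe_url(url):
    """Split a CGI-style URL into the pieces named in the URL format."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "host": parts.hostname,
        "port": parts.port,
        "file": parts.path,
        # parse_qs maps each name to a list, because a name may repeat.
        "params": parse_qs(parts.query),
    }


# Hypothetical URL, chosen only to match the format shown above.
info = describe_url("http://localhost:8080/file.cgi?name1=value1&name2=value2")
print(info)
```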
4
URL Encoding
Names and values are URL encoded
Each special character (", ', =, &, etc.) is encoded
as %nn (hex)
Each space is encoded as +
Many standard libraries provide
functions/methods to encode and decode...
5
URL Encoding in Python/Java
See urlencode.py
See UrlEncode.java
Try command line arguments:
one
't wo'
'th" ree'
'~!@#$%^&*()_+'
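The course's urlencode.py is not reproduced here. A minimal sketch of the same idea, assuming Python 3 and the standard urllib.parse functions quote_plus and unquote_plus, applied to the four sample arguments above:

```python
from urllib.parse import quote_plus, unquote_plus

# The sample arguments listed above, as Python strings.
SAMPLES = ["one", "t wo", 'th" ree', "~!@#$%^&*()_+"]


def encode_all(values):
    """URL-encode each value: special characters become %nn, spaces become +."""
    return [quote_plus(value) for value in values]


encoded = encode_all(SAMPLES)
for raw, enc in zip(SAMPLES, encoded):
    # Decoding the encoded form should give back the original string.
    print(repr(raw), "->", enc, "->", repr(unquote_plus(enc)))
```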
6
HTML Links (Revisited)
<a href="http://host:port/file.cgi?name1=value1&name2=value2">
Same as previously described, except...
Browser requests that the program named file.cgi
be executed
Web server passes name/value pair(s) to the
program
7
HTML Forms (Revisited)
<form action="http://host:port/file.cgi" method="get">
<input type="text" name="name1" value="value1"><br>
<input type="text" name="name2" value="value2"><br>
<input type="submit">
</form>
(method could instead be "post"; see upcoming slides)
Same as previously described, except...
Browser requests that the program named file.cgi
be executed
Web server passes name/value pair(s) to the
program
8
CGI Details: GET Method
Browser sends request over a socket to Web server:
GET file.cgi?name1=value1&name2=value2 HTTP/1.1
Host: host
<Blank line>
(Method could instead be POST; see upcoming slides)
Web server fork/execs file.cgi, connected via a pipe (usually)
Web server sets environment variable
QUERY_STRING: name1=value1&name2=value2
9
CGI Details: GET Method
file.cgi:
Uses QUERY_STRING
May use a database or some file
Writes to stdout, via a pipe (usually), to Web server:
Content-Type: text/html
<Blank line>
<HTML page>
Web server waits for file.cgi, then sends response over a socket to Browser:
HTTP/1.1 200 OK
Date: date
Server: server
…
Content-Type: text/html
<Blank line>
<HTML page>
(There are many others)
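The GET flow above can be imitated in a few lines of Python. This is only a sketch of the mechanism, not how Apache is implemented: subprocess.run stands in for fork/exec plus wait, the child's stdout is a pipe, and CHILD_SOURCE is a hypothetical stand-in for file.cgi.

```python
import os
import subprocess
import sys

# Hypothetical stand-in for file.cgi: uses QUERY_STRING, writes to stdout.
CHILD_SOURCE = r'''
import html
import os
query = os.environ.get("QUERY_STRING", "")
print("Content-Type: text/html")
print()
print("<html><body>" + html.escape(query) + "</body></html>")
'''


def run_cgi(query_string):
    """Play the Web server's role: set QUERY_STRING, run the child, read its pipe."""
    env = dict(os.environ, REQUEST_METHOD="GET", QUERY_STRING=query_string)
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        env=env,
        stdout=subprocess.PIPE,  # the pipe between the CGI program and the server
        check=True,              # raise if the child exits with an error
    )
    return completed.stdout.decode("utf-8")


def split_cgi_output(output):
    """Separate the CGI headers from the page at the first blank line."""
    normalized = output.replace("\r\n", "\n")
    head, _, body = normalized.partition("\n\n")
    return head.split("\n"), body


headers, body = split_cgi_output(run_cgi("name1=value1&name2=value2"))
print(headers)
print(body)
```

A real Web server would then prepend the status line and its own headers (Date, Server, and so on) before sending the result to the browser.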
10
"Hello Get" Application
Browser
Sends HTTP request
Contains "person=handle" via GET method
Web server and CGI program
Retrieve "person=handle"
Compose and send HTTP response
"hello, handle"
11
"Hello Get" in Python
Browse to this URL:
http://www.cs.princeton.edu/~rdondero/cos333/HelloGetPython/index.html
Fill in form
Click on Submit button
Refresh the page
View → Page Source
12
"Hello Get" in Python
See HelloGetPython Application
index.html
Convenience for user
hello.cgi
Fetches "person=handle" from QUERY_STRING
env var
Composes response page
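The actual hello.cgi is not shown in these slides. A minimal sketch of what such a program might look like, assuming Python 3 and a page body of the form "hello, handle"; the function name build_response is hypothetical:

```python
import html
import os
from urllib.parse import parse_qs


def build_response(query_string):
    """Fetch "person=handle" from the query string and compose the response."""
    params = parse_qs(query_string, keep_blank_values=True)
    # Fall back to an empty name if the browser sent no "person" pair.
    person = params.get("person", [""])[0]
    # Escape the value so that user input cannot inject HTML into the page.
    page = "<html><body>hello, " + html.escape(person) + "</body></html>"
    # CGI output: header line(s), a blank line, then the page.
    return "Content-Type: text/html\n\n" + page + "\n"


if __name__ == "__main__":
    # The Web server supplies QUERY_STRING as an environment variable.
    print(build_response(os.environ.get("QUERY_STRING", "")), end="")
```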
13
"Hello Get" in Python
Problem:
Python interpreter used by CS Apache Web server
is old, and is missing MySQLdb module
Solution:
Hard-code Python interpreter location when using
CS Apache Web server
And incidentally...
For portability, on my local computer I defined
/usr/local/bin/python as an alias for /usr/bin/python
14
"Hello Get" in Java
Browse to this URL:
http://www.cs.princeton.edu/~rdondero/cos333/HelloGetJava/index.html
Fill in form
Click on Submit button
Refresh the page
View → Page Source
15
"Hello Get" in Java
See HelloGetJava Application
index.html
Convenience for user
hello.cgi
Calls "java Hello"
Hello.java
Fetches "person=handle" from QUERY_STRING
env var
Composes response page
16
"Hello Get" in Java
Problem:
URL cannot specify execution of Java interpreter
Solution:
Web server executes Bash "helper" script
Bash helper script executes Java interpreter
17
"Hello Get" in Java
Problem:
Default Java interpreter used by CS Dept Apache
Web server is old
Solution:
Set PATH in .cgi script
18
CGI Details: POST Method
Browser sends request over a socket to Web server:
POST file.cgi HTTP/1.0
Host: host
Content-type: application/x-www-form-urlencoded
Content-length: length
<Blank line>
name1=value1&name2=value2
Web server fork/execs file.cgi, connected via a pipe (usually)
Web server writes to the program's stdin:
stdin: name1=value1&name2=value2
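To make the shape of the POST request concrete, the following sketch composes the request text by hand. It is illustrative only (a browser does this for you); the host name and path are made-up values, and build_post_request is a hypothetical helper.

```python
from urllib.parse import urlencode


def build_post_request(host, path, params):
    """Compose the text of a form POST: headers, blank line, encoded body."""
    body = urlencode(params)  # e.g. name1=value1&name2=value2
    lines = [
        "POST " + path + " HTTP/1.0",
        "Host: " + host,
        "Content-Type: application/x-www-form-urlencoded",
        # The length is a byte count; the encoded body is pure ASCII.
        "Content-Length: " + str(len(body.encode("ascii"))),
        "",  # the blank line that separates headers from body
        body,
    ]
    return "\r\n".join(lines)


# Hypothetical host and path, for illustration only.
request = build_post_request(
    "localhost", "/file.cgi", {"name1": "value1", "name2": "value2"}
)
print(request)
```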
19
CGI Details: POST Method
file.cgi:
Reads from stdin
May use a database or some file
Writes to stdout, via a pipe (usually), to Web server:
Content-Type: text/html
<Blank line>
<HTML page>
Web server waits for file.cgi, then sends response over a socket to Browser:
HTTP/1.1 200 OK
Date: date
Server: server
…
Content-Type: text/html
<Blank line>
<HTML page>
(There are many others)
20
"Hello Post" Application
Browser
Sends HTTP request
Contains "person=handle" via POST method
Web server and CGI program
Retrieve "person=handle"
Compose and send HTTP response
"hello, handle"
21
"Hello Post" in Python
Browse to this URL:
http://www.cs.princeton.edu/~rdondero/cos333/HelloPostPython/index.html
Fill in form
Click on Submit button
Refresh the page
View → Page Source
22
"Hello Post" in Python
See HelloPostPython Application
index.html
Convenience for user
hello.cgi
Fetches "person=handle" from stdin
Composes response page
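Again the real hello.cgi is not shown here. A sketch of the POST variant, assuming Python 3: the only substantive difference from the GET version is that the pairs come from stdin rather than from QUERY_STRING. The names read_form and build_response are hypothetical, and an in-memory stream stands in for stdin so the example can run outside a Web server.

```python
import html
import io
from urllib.parse import parse_qs


def read_form(stream, content_length):
    """Read exactly content_length characters of URL-encoded form data."""
    # A URL-encoded body is ASCII, so characters and bytes coincide here.
    body = stream.read(content_length)
    return parse_qs(body, keep_blank_values=True)


def build_response(form):
    """Compose the CGI response for a parsed form."""
    person = form.get("person", [""])[0]
    page = "<html><body>hello, " + html.escape(person) + "</body></html>"
    return "Content-Type: text/html\n\n" + page + "\n"


# In a real CGI program the stream would be sys.stdin and the length would
# come from the CONTENT_LENGTH environment variable set by the Web server.
fake_stdin = io.StringIO("person=alice")
response = build_response(read_form(fake_stdin, len("person=alice")))
print(response, end="")
```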
23
"Hello Post" in Java
Browse to this URL:
http://www.cs.princeton.edu/~rdondero/cos333/HelloPostJava/index.html
Fill in form
Click on Submit button
Refresh the page
View → Page Source
24
"Hello Post" in Java
See HelloPostJava Application
index.html
Convenience for user
hello.cgi
Calls "java Hello"
Hello.java
Fetches "person=handle" from stdin
Composes response page
25
"FancyForm" Application
Illustrates fancy links and forms
Browse to this URL:
http://www.cs.princeton.edu/~rdondero/cos333/FancyForm/index.html
Click on anchor
Fill in forms
Click on Submit buttons
26
"FancyForm" in Python
See FancyForm application
index.html
fancyform.cgi
input tag
Type text, password, radio, checkbox, hidden,
reset, submit
textarea tag
select and option tags
27
"FancyForm" in Java
Omitted
(Nothing new)
28
GET vs. POST
GET method
"name=value" pairs passed in request line (as part of the URL)
POST method
"name=value" pairs passed in request body
Same power
When to use which?
29
GET vs. POST Technical Criteria
Use POST when:
There are very many "name=value" pairs, and/or...
Some "name=value" pairs are very long
Some browsers have a max URL length
E.g., MS Internet Explorer max URL length is
(was?) 2083 chars
"name=value" pairs contain non-ASCII chars
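The length criterion can be sketched as a small check. This covers only URL length, not the non-ASCII criterion; the 2083 limit is the Internet Explorer figure quoted above, and suggest_method is a hypothetical helper rather than part of any library.

```python
from urllib.parse import urlencode

MAX_URL_LENGTH = 2083  # the Internet Explorer figure quoted above


def suggest_method(action_url, params, limit=MAX_URL_LENGTH):
    """Suggest "get" if the full URL fits within limit, else "post"."""
    full_url = action_url + "?" + urlencode(params)
    return "get" if len(full_url) <= limit else "post"


# Hypothetical URL and parameters, for illustration only.
print(suggest_method("http://localhost/file.cgi", {"q": "short"}))
print(suggest_method("http://localhost/file.cgi", {"q": "x" * 3000}))
```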
30
GET vs. POST Convention
Use GET iff request is idempotent
The request doesn't change server-side state
Processing the same "name=value" pairs twice has
same effect as processing them once
Use POST iff request is not idempotent
The request does change server-side state
Processing the same "name=value" pairs twice has a
different effect from processing them once
31
GET vs. POST Convention
With that convention...
Browser sees POST request =>
Browser assumes request is not idempotent
Browser warns about page refresh
Browser sees GET request =>
Browser assumes request is idempotent
Browser does not warn about page refresh
32
GET vs. POST Examples
Example: Web page asks for the price of a car
Form should use GET
Browser:
GET => idempotent
Refresh page => no problem
Example: Web page purchases a car
Form should use POST
Browser:
POST => not idempotent
Refresh page => generate warning
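The car example can be mimicked with a toy in-memory model, which shows why repeating one request is harmless and repeating the other is not. The CarLot class and its data are invented for illustration.

```python
class CarLot:
    """Toy server-side state: prices and stock counts per model."""

    def __init__(self, prices, stock):
        self._prices = dict(prices)
        self._stock = dict(stock)

    def price(self, model):
        """GET-style request: reads state, never changes it."""
        return self._prices[model]

    def purchase(self, model):
        """POST-style request: each call changes server-side state."""
        if self._stock[model] == 0:
            raise ValueError("out of stock: " + model)
        self._stock[model] -= 1
        return self._stock[model]

    def in_stock(self, model):
        return self._stock[model]


lot = CarLot({"sedan": 20000}, {"sedan": 2})
lot.price("sedan")     # "refreshing" a price query ...
lot.price("sedan")     # ... leaves the stock untouched
lot.purchase("sedan")  # "refreshing" a purchase ...
lot.purchase("sedan")  # ... buys a second car
print(lot.in_stock("sedan"))
```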
33
GET vs. POST Recommendation
Generally:
Use GET to query server-side data
And maybe for learning
Use POST to change server-side data
34
CGI App Deployment
Using the CS Apache web server:
On penguins:
/u/loginid/public_html/cos333/AppName/
    page1.html
    page2.cgi
    ...
Remember to set file permissions to world readable (and executable)
In browser:
http://www.cs.princeton.edu/~loginid/cos333/AppName/page1.html
http://www.cs.princeton.edu/~loginid/cos333/AppName/page2.cgi?name1=value1&name2=value2&...
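The file permissions mentioned above are normally set with chmod at the shell. For consistency with the other examples, here is an equivalent sketch in Python, assuming a POSIX system; make_world_readable is a hypothetical helper, and the modes 0o644 and 0o755 are a common choice rather than a course requirement.

```python
import os
import stat


def make_world_readable(path, executable=False):
    """Set path to rw-r--r-- (0o644), or rwxr-xr-x (0o755) if executable."""
    mode = stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IROTH
    if executable:
        # CGI programs must also be executable by the Web server's user.
        mode |= stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    os.chmod(path, mode)
    # Return the permission bits actually in effect.
    return stat.S_IMODE(os.stat(path).st_mode)
```

For example, make_world_readable("page1.html") would suit a static page, and make_world_readable("page2.cgi", executable=True) a CGI program.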
35
CGI App Debugging
Apache Web server keeps error log
Contains Web app failure reports (sometimes)
Contains text that Web app writes to stderr
Using CS Dept Apache Web server:
Can't view error log directly, but...
Most recent messages available via:
https://csguide.cs.princeton.edu/publishing/errorlogs
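Because text written to stderr ends up in the error log, a CGI program can leave itself debugging notes there without disturbing the page it writes to stdout. A minimal sketch; log_debug and its "[cgi debug]" prefix are hypothetical conventions.

```python
import sys


def log_debug(message, stream=None):
    """Write a debugging line to stderr (or to the given stream)."""
    stream = sys.stderr if stream is None else stream
    stream.write("[cgi debug] " + message + "\n")
    # Flush so the message is not lost if the program crashes soon after.
    stream.flush()


log_debug("searchresults.cgi started")
```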
36
A Running Example
Pennypack Applications
Browser sends author name prefix to Web server
Web server passes author name prefix to CGI program
CGI program queries database
CGI program returns author names, book titles, book prices to Web server
Web server returns author names, book titles, book prices to Browser
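The database step of the running example can be sketched as a prefix query. The Pennypack applications use MySQL; this sketch swaps in Python's built-in sqlite3 with an in-memory database so that it runs anywhere, and the table layout and sample rows are invented for illustration.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE books (author TEXT, title TEXT, price REAL)")
    conn.executemany(
        "INSERT INTO books VALUES (?, ?, ?)",
        [
            ("Author One", "Title A", 10.0),
            ("Author Two", "Title B", 12.5),
            ("Writer Three", "Title C", 8.0),
        ],
    )
    return conn


def search_by_author_prefix(conn, prefix):
    """Return (author, title, price) rows whose author starts with prefix."""
    # Escape LIKE wildcards so the prefix is matched literally.
    escaped = (
        prefix.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    cursor = conn.execute(
        "SELECT author, title, price FROM books "
        "WHERE author LIKE ? ESCAPE '\\' ORDER BY author, title",
        (escaped + "%",),  # bound parameter: no SQL is built from user input
    )
    return cursor.fetchall()


conn = make_demo_db()
print(search_by_author_prefix(conn, "Author"))
```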
37
PennypackPython1 Application
See PennypackPython1 application
book.py
database.py
common.py
index.html
searchform.cgi
searchresults.cgi
38
PennypackPython1 Application
Problem:
Location of DB server varies
CS Dept: publicdb.cs.princeton.edu
My local computer: 127.0.0.1 or localhost
Solution:
Default to publicdb.cs.princeton.edu
Define a DB_SERVER_HOST environment variable
in local Web server config file to override
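The default-plus-override idea can be expressed in a couple of lines. This is a sketch, not the course's common.py; the function name db_server_host is hypothetical, while the variable name and default host come from the slide above.

```python
import os

DEFAULT_DB_SERVER_HOST = "publicdb.cs.princeton.edu"


def db_server_host(environ=None):
    """Return DB_SERVER_HOST if the Web server defines it, else the default."""
    environ = os.environ if environ is None else environ
    return environ.get("DB_SERVER_HOST", DEFAULT_DB_SERVER_HOST)


print(db_server_host())
```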
39
Toward PennypackPython2
Problem:
Missing author name causes searchresults.cgi to
crash!
Solution:
Refactor and enhance...
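The slides do not show the cause of the crash or the actual fix. One common way to tolerate a missing parameter, sketched here with a hypothetical parameter name "author" and helper get_author, is to supply a default rather than index into the parsed query directly:

```python
from urllib.parse import parse_qs


def get_author(query_string):
    """Return the author prefix, or "" if it is missing or blank."""
    # keep_blank_values=True keeps "author=" as an empty string.
    params = parse_qs(query_string, keep_blank_values=True)
    # .get with a default avoids a KeyError when the pair is absent.
    return params.get("author", [""])[0].strip()


print(repr(get_author("author=Ker")))
print(repr(get_author("")))
```

The caller can then decide what an empty prefix should mean, for example redisplaying the search form with an error message.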
40
PennypackPython2 Application
See PennypackPython2 application
book.py (same)
database.py (same)
common.py (same)
index.html (same)
searchform.cgi (different)
searchform.py (new)
searchresults.cgi (different)
searchresults.py (new)
41
PennypackJava1 Application
See PennypackJava1 application
Book.java
Database.java
index.html
searchform.cgi
SearchForm.java
searchresults.cgi
SearchResults.java
Common.java
Cgi.java
42
PennypackJava1 Application
Problem:
JVM launch is slow, and so...
Java is seldom used for CGI programming, and
so...
Java standard library has no CGI-handling classes
Solution:
I wrote one that we can use: Cgi.java
43
PennypackJava1 Application
Problem:
Location of MySql driver must be in CLASSPATH
Solution:
Set CLASSPATH in searchresults.cgi
44
PennypackJava1 Application
Problem:
Location of DB server varies
Solution:
Same as with Python
45
Toward PennypackJava2
Problem:
Missing author name is not handled as I wish
Solution:
Refactor and enhance...
46
PennypackJava2 Application
See PennypackJava2 application
Book.java (same)
Database.java (same)
index.html (same)
searchform.cgi (same)
SearchForm.java (different)
searchresults.cgi (same)
SearchResults.java (different)
Common.java (same)
Cgi.java (same)
47
Summary
We have covered:
Common Gateway Interface (CGI) programming
GET method
POST method
In Python and Java
48