Input Validation – “All input is evil”
CS2
Background
Summary: Any input that comes into a program from an external source – such as a user
typing at a keyboard or a network connection – can be the source of security
concerns and disastrous bugs. All input should be treated as potentially dangerous.
Description: All interesting software packages rely upon external input. Although information
typed at a computer might be the most familiar, networks and external devices can also send
data to a program. Generally, this data will be of a specific type: for example, a user interface
that requests a person’s name might be written to expect a series of alphabetic characters. If
the correct type and form of data is provided, the program might work fine. However, if
programs are not carefully written, attackers can construct inputs that can cause malicious
code to be executed.
Risk – How can it happen? Any data that enters your program from an external source
can be a source of problems. If external data is not checked to verify that it has the
right type of information, the right amount of information, and the right structure of information,
it can cause problems.
Input validation errors can lead to buffer overflows if the data being provided is used as an
index into an array. Input that is used as the basis for a database search can be used as the
basis for SQL injections, which use carefully constructed inputs to make relational databases
reveal data inappropriately or even destroy data.
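To make the SQL injection risk concrete, here is a minimal sketch that only builds query text and never contacts a database. The table and column names (accounts, owner), the class name, and the name pattern are hypothetical, chosen purely for illustration.

```java
final class InjectionDemo {
    // UNSAFE: splices raw input directly into the query text.
    static String buildQueryUnsafely(String owner) {
        return "SELECT * FROM accounts WHERE owner = '" + owner + "'";
    }

    // One possible whitelist check (an assumption, not a general rule for names):
    // a letter followed by up to 39 letters, spaces, or hyphens.
    static boolean looksLikeName(String s) {
        return s != null && s.matches("[A-Za-z][A-Za-z -]{0,39}");
    }

    public static void main(String[] args) {
        String hostile = "x' OR '1'='1";
        // The quote characters in the input change the structure of the query.
        System.out.println(buildQueryUnsafely(hostile));
        // The whitelist check rejects the same input before it reaches the query.
        System.out.println(looksLikeName(hostile));
    }
}
```

Validation of this kind is only one layer of defense. In Java, the usual remedy is to pass input as a parameter of a java.sql.PreparedStatement rather than concatenating it into the query string.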
Example of Occurrence: A Norwegian woman mistyped her account number on an internet
banking system. Instead of typing her 11-digit account number, she accidentally typed an
extra digit, for a total of 12 numbers. The system discarded the extra digit, and transferred
$100,000 to the (incorrect) account given by the 11 remaining numbers. A simple dialog box
informing her that she had typed too many digits would have gone a long way towards
avoiding this expensive error.
Olsen, Kai. “The $100,000 Keying error” IEEE Computer, August 2008
Example in Code: This program stores the squares of the numbers from one to ten in an array,
and then asks the user to type a number. The square of that number is then printed:
import java.util.Scanner;

public class InputValidationExample {
    public static void main(String[] args) {
        int[] vals = new int[10];
        for (int i = 0; i < 10; i++) {
            vals[i] = (i + 1) * (i + 1);
        }
        System.out.print("Please type a number: ");
        Scanner sc = new Scanner(System.in);
        int which = sc.nextInt();
        int square = vals[which - 1];
        System.out.println("The square of " + which + " is " + square);
    }
}
This program has two input validation problems. The first comes with the use of the scanner
to read an integer from the console: int which = sc.nextInt(). If the user types a number,
this will work just fine. However, if the user types something that is not a number, an
InputMismatchException will be thrown. A robust program would catch this error, provide a
clear and appropriate error message, and ask the person to re-type their input.
The second problem occurs when the array is accessed. Even if the user provides an
appropriate integer, the value may be out of the range of the array. A Java array containing 10
elements can only be accessed by indices 0,1,...,9. Thus, the only values of which that will
work correctly are 1,2,...,10. Any values outside of this range will lead to an attempt to access
a value outside the range of the array. In Java, this will lead to an exception. In other
languages, this may lead to a buffer overflow that might be exploited by malicious software.
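One way to address both problems is sketched below. This is an illustration under stated assumptions, not the only possible fix: the class name SafeSquares and the helper readInRange are hypothetical, and the main method reads from a scripted string instead of System.in so that the behavior is repeatable.

```java
import java.util.InputMismatchException;
import java.util.Scanner;

final class SafeSquares {
    // Keeps asking until the scanner yields an integer in the range 1..max.
    static int readInRange(Scanner sc, int max) {
        while (true) {
            System.out.print("Please type a number from 1 to " + max + ": ");
            try {
                int value = sc.nextInt();
                if (value >= 1 && value <= max) {
                    return value;
                }
                System.out.println(value + " is out of range.");
            } catch (InputMismatchException e) {
                System.out.println("That is not a whole number.");
                sc.next(); // discard the offending token so it is not read again
            }
        }
    }

    public static void main(String[] args) {
        int[] vals = new int[10];
        for (int i = 0; i < vals.length; i++) {
            vals[i] = (i + 1) * (i + 1);
        }
        // Scripted input stands in for the keyboard: three bad entries, then a good one.
        Scanner sc = new Scanner("abc 0 11 7");
        int which = readInRange(sc, vals.length);
        System.out.println("The square of " + which + " is " + vals[which - 1]);
    }
}
```

The range check uses vals.length rather than a literal 10, so the check stays correct if the array size changes. If the input runs out entirely, nextInt() throws NoSuchElementException, which this sketch deliberately leaves uncaught.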
How can I avoid input validation problems?
Check your input: The basic rule for input validation is to check that input data matches all
of the constraints that it must meet to be used correctly in the given circumstance. In many
cases, this can be very difficult: confirming that a set of digits is, in fact, a telephone number
may require consideration of the many differing phone number formats used by countries
around the world. Some of the checks that you might want to use include:
• Type: Input data should be of the right type. Names should generally be alphabetic,
  numbers numeric. Punctuation and other uncommon characters are particularly
  troubling, as they can often be used to form the basis of code-injection attacks.
  Many programs will handle input data by assuming that all input is of string form,
  verifying that the string contains appropriate characters, and then converting the
  string into the desired data type.
• Range: Verify that numbers are within a range of possible values. For example,
  the month of a person's date of birth should lie between 1 and 12. Another
  common range check involves values that may lead to division by zero errors.
• Plausibility: Check that values make sense: a person's age shouldn't be less than 0
  or more than 150.
• Presence check: Guarantee presence of important data – the omission of important
  data can be seen as an input validation error.
• Length: Input that is either too long or too short will not be legitimate. Phone
  numbers generally don't have 39 digits; Social Security Numbers have exactly 9 digits.
• Format: Dates, credit card numbers, and other data types have limitations on the
  number of digits and any other characters used for separation. For example, dates
  are usually specified by one or two digits for the month, one or two for the day, and
  either two or four for the year.
• Checksums: Identification numbers, such as bank account numbers, often have check
  digits: additional digits included at the end of a number to provide a verifiability
  check. The check digit is determined by a calculation based on the remaining digits
  – if the check digit does not match the results of the calculation, either the ID is bad
  or the check digit is bad. In either case, the number should be rejected as invalid.
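Several of these checks can be sketched as small Java methods. The class and method names are hypothetical, the date separator is assumed to be "/", and the checksum shown is the Luhn algorithm, which is one widely used check-digit scheme (for example, on credit card numbers). The text above does not name a particular scheme, so treat Luhn as an illustrative stand-in.

```java
final class InputChecks {
    // Presence and type: the string is non-empty and every character is an ASCII digit.
    static boolean isAllDigits(String s) {
        if (s == null || s.isEmpty()) return false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') return false;
        }
        return true;
    }

    // Range: a month must lie between 1 and 12.
    static boolean isValidMonth(int month) {
        return month >= 1 && month <= 12;
    }

    // Length: exactly `expected` digits, e.g. 9 for a Social Security Number.
    static boolean hasDigitCount(String s, int expected) {
        return isAllDigits(s) && s.length() == expected;
    }

    // Format: 1-2 digits, "/", 1-2 digits, "/", then a 2- or 4-digit year.
    static boolean looksLikeDate(String s) {
        return s != null && s.matches("\\d{1,2}/\\d{1,2}/(\\d{2}|\\d{4})");
    }

    // Checksum (Luhn): starting from the rightmost digit, double every second
    // digit, subtract 9 from any result above 9, and require a total divisible by 10.
    static boolean passesLuhn(String digits) {
        if (!isAllDigits(digits)) return false;
        int sum = 0;
        boolean doubleIt = false;
        for (int i = digits.length() - 1; i >= 0; i--) {
            int d = digits.charAt(i) - '0';
            if (doubleIt) {
                d *= 2;
                if (d > 9) d -= 9;
            }
            sum += d;
            doubleIt = !doubleIt;
        }
        return sum % 10 == 0;
    }

    public static void main(String[] args) {
        System.out.println(hasDigitCount("123456789", 9));
        System.out.println(looksLikeDate("07/24/2009"));
        System.out.println(passesLuhn("79927398713"));
    }
}
```

Note that looksLikeDate only checks the shape of the input. A string such as "99/99/2009" has the right format and would still need the range check before it could be trusted.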
Use appropriate language tools: The safety of tools that read user input varies across
programming languages and systems. Some languages, such as C and C++, have library
calls that read user input into a character buffer without checking the bounds of that buffer,
causing both a buffer overflow and an input validation problem. Alternative libraries
specifically designed with security in mind are often more robust.
The choice of programming language can play a role in the potential severity of input
validation vulnerabilities. As statically typed languages, Java and C++ require that the type of
data stored in a variable is known ahead of time. This requirement leads to the type
mismatch problem when – for example – a string such as “abcd” is typed in response to a
request for an integer. Dynamically typed languages such as Perl and Ruby do not have any such
requirement – any variable can store any type of value. Of course, these languages do not
eliminate validation problems – you may still run into trouble if you use a string to retrieve an
item from an integer-indexed array. Some languages provide additional help in the form of
built-in procedures that can be used to remove potentially damaging characters from input
strings.
Recover Appropriately: A robust program will respond to invalid input in a manner that is
appropriate, correct, and secure. For user input, this will often mean providing an informative
error message and requesting re-entry of the data. Invalid input from other sources – such as
a network connection – may require alternate measures. Arbitrary decisions such as
truncating or otherwise reformatting data to “make it fit” should be avoided.
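The difference between rejecting and "making it fit" can be sketched with the 11-digit account number from the banking example earlier in this document. The class and method names are hypothetical, and truncate() is only a guess at how such a system might silently drop a digit; it is shown as the behavior to avoid.

```java
import java.util.Optional;

final class AccountNumbers {
    static final int EXPECTED_DIGITS = 11; // length taken from the banking example

    // Accepts the input only if it is exactly 11 digits; it never trims or pads.
    static Optional<String> parse(String typed) {
        if (typed == null || typed.length() != EXPECTED_DIGITS) {
            return Optional.empty();
        }
        for (int i = 0; i < typed.length(); i++) {
            char c = typed.charAt(i);
            if (c < '0' || c > '9') {
                return Optional.empty();
            }
        }
        return Optional.of(typed);
    }

    // The "make it fit" behavior to avoid: silently keeps the first 11 characters.
    static String truncate(String typed) {
        return typed.length() > EXPECTED_DIGITS ? typed.substring(0, EXPECTED_DIGITS) : typed;
    }

    public static void main(String[] args) {
        String typed = "123456789012"; // one digit too many
        System.out.println("truncate gives: " + truncate(typed));
        System.out.println("parse accepts: " + parse(typed).isPresent());
    }
}
```

Returning an empty Optional forces the caller to decide what to do with bad input, such as showing an error message and asking again, instead of continuing with a value the user never typed.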
Laboratory/Homework Assignment:
Consider this program:
import java.util.*;

public class Input {
    public static void main(String[] args) {
        Scanner scan = new Scanner(System.in);
        int sz = getArraySize(scan);
        String[] names = getNames(scan, sz);
        int which = getWhich(scan);
        String aName = getName(which, names);
        System.out.println("You choose name: " + aName);
    }

    public static int getArraySize(Scanner scan) {
        System.out.print("How many names? ");
        int n = scan.nextInt(); // V: not checked for type, length, format, or reasonableness
        scan.nextLine();
        return n;
    }

    public static String[] getNames(Scanner scan, int sz) {
        String[] names = new String[sz];
        for (int i = 0; i < sz; i++) {
            System.out.print("type name # " + (i + 1) + ": ");
            names[i] = scan.nextLine(); // V: not checked for type, length, format, or reasonableness
        }
        return names;
    }

    public static int getWhich(Scanner scan) {
        System.out.print("Which name: ");
        int x = scan.nextInt(); // V: not checked for type, length, format, or reasonableness
        return x;
    }

    public static String getName(int n, String[] vals) {
        return vals[n - 1];
    }
}
1. Complete the following checklist for this program.
See the V annotations in the program listing above.
2. List the potential input validation errors.
The index returned by getWhich() is not validated. Also, if a non-integer value is typed for the
prompts in getArraySize() and getWhich(), an exception will be thrown. The value returned by
getArraySize() is not checked for reasonableness – it might be negative or absurdly large.
3. Provide example inputs that might cause validation problems, and describe the
problems that they might cause.
If the number typed for getWhich() is greater than the number provided for getArraySize(), or it
is less than one, the index used in getName() will be out of bounds, and an
ArrayIndexOutOfBoundsException will be thrown.
See next question as well.
4. What happens if you type non-numeric characters for either the number of names or
which name you wanted to retrieve?
An InputMismatchException will be thrown. Because the program does not catch it, the program terminates.
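The failures described in answers 3 and 4 can be reproduced without typing anything. The sketch below uses hypothetical names (FailureDemo, readIntOutcome, lookupOutcome) and feeds Scanner a string in place of the keyboard, reporting which exception, if any, each operation raises.

```java
import java.util.InputMismatchException;
import java.util.Scanner;

final class FailureDemo {
    // Tries to read an int from the given text and reports what happened.
    static String readIntOutcome(String text) {
        try {
            new Scanner(text).nextInt();
            return "none";
        } catch (InputMismatchException e) {
            return e.getClass().getSimpleName();
        }
    }

    // Mimics getName(): looks up names[which - 1] and reports what happened.
    static String lookupOutcome(String[] names, int which) {
        try {
            String unused = names[which - 1];
            return "none";
        } catch (ArrayIndexOutOfBoundsException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        String[] names = {"Ann", "Bob"};
        System.out.println("reading \"abc\": " + readIntOutcome("abc"));
        System.out.println("choosing name 3 of 2: " + lookupOutcome(names, 3));
        System.out.println("choosing name 0 of 2: " + lookupOutcome(names, 0));
    }
}
```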
5. Revise the program to properly validate input and gracefully recover from errors.
import java.util.*;

public class Input2 {
    public static void main(String[] args) {
        Scanner scan = new Scanner(System.in);
        int sz = getArraySize(scan);
        String[] names = getNames(scan, sz);
        int which = getWhich(scan, names.length);
        String aName = getName(which, names);
        System.out.println("You choose name: " + aName);
    }

    // Repeats the prompt until the user types an integer of at least 1.
    // A size of 0 must be rejected: getWhich() would then have no valid answer
    // and would loop forever.
    public static int getArraySize(Scanner scan) {
        int n = -1;
        while (n < 1) {
            try {
                System.out.print("How many names? ");
                n = scan.nextInt();
                scan.nextLine();
            } catch (InputMismatchException e) {
                System.out.println("Please type an integer");
                scan.nextLine(); // discard the invalid line
            }
        }
        return n;
    }

    public static String[] getNames(Scanner scan, int sz) {
        String[] names = new String[sz];
        for (int i = 0; i < sz; i++) {
            System.out.print("type name # " + (i + 1) + ": ");
            names[i] = scan.nextLine();
        }
        return names;
    }

    // Repeats the prompt until the user types an integer between 1 and length.
    public static int getWhich(Scanner scan, int length) {
        int x = -1;
        while (x < 1 || x > length) {
            try {
                System.out.print("Which name: ");
                x = scan.nextInt();
                scan.nextLine();
            } catch (InputMismatchException e) {
                System.out.println("Please type an integer value");
                scan.nextLine(); // discard the invalid line
            }
        }
        return x;
    }

    // Re-checks the index so that this method is safe even if called directly.
    public static String getName(int n, String[] vals) {
        if (n >= 1 && n <= vals.length) {
            return vals[n - 1];
        } else {
            return "";
        }
    }
}
The original program did not validate the name for format or length, and neither does this version. Validating names is
extremely difficult, as punctuation characters, digits, and other non-alphabetic characters might appear.
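Even when the characters in a name cannot be restricted, the presence and length checks described earlier still apply. The following is a minimal sketch with a hypothetical class name and an arbitrary 100-character cap chosen only for illustration. It accepts punctuation and digits but rejects blank input, overly long input, and control characters.

```java
final class NameCheck {
    static final int MAX_LENGTH = 100; // arbitrary cap, an assumption for this sketch

    static boolean isAcceptable(String name) {
        if (name == null) return false;
        String trimmed = name.trim();
        // Presence and length checks.
        if (trimmed.isEmpty() || trimmed.length() > MAX_LENGTH) return false;
        // Reject control characters such as tabs or bells; allow everything else.
        for (int i = 0; i < trimmed.length(); i++) {
            if (Character.isISOControl(trimmed.charAt(i))) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isAcceptable("Anne-Marie O'Neil"));
        System.out.println(isAcceptable("   "));
    }
}
```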
6. You’re writing a program that will be used to submit bids on items from an online auction
site. Each item is available in multiple lots – for example, there might be 100 boxes of
crayons available. Your program must ask users for two important pieces of
information:
1. The price that they are willing to pay for the item, given in dollars and cents.
2. The quantity of that item that they want to bid on. This quantity must be at least
one, and it must be a whole number – it’s not possible to buy fractional parts of an
item.
Your program should validate the user input for both quantities. To do so, take the
following steps:
• Price of items
  1. Ask the user to type the price.
  2. Read this value into a string.
  3. Write a routine that will examine the string to verify that it contains a number that
     can be a legal amount of money. This string must contain:
     • Some number of digits (possibly zero) – the number of dollars
     • An optional decimal point
     • Some number of digits (possibly zero) – the number of cents
     • At least one digit somewhere in the number
     Thus, “12.34”, “12”, and “.34” are all valid prices, but “12,34” and “.” are not.
     To examine all of the characters in the string, you can use the charAt(i) method
     from the String class. You can check each character in the string to see that it is
     either a decimal point or a digit. You should also check to make sure that there
     are not multiple decimal points.
     This routine should return a boolean value that is true if the string is a legitimate
     price and false otherwise.
  4. If the value provided is not legitimate, print an error message.
  5. If the value is legitimate, convert it into a float by using
     float f = Float.parseFloat(s)
     if s is the string that the user typed.
• Number of items: This will be similar to the price, but you must simply check for
  whole numbers – no decimals allowed. The quantity must be greater than or equal
  to one. You will convert the result to an int (using Integer.parseInt(s)), not a
  float.
• An interactive loop: Put these two checks in a loop that will ask for both values,
  check both to see if they are valid, and then repeat requests for any values that are
  not valid.
  The easiest way to do this is probably to have two boolean values, one for each
  input quantity. These values will represent the validity of the input quantities. You
  will stay in an input loop as long as at least one of them is false. When this is the
  case, you will then check the values to see which is false (i.e., invalid). If a value is
  not valid, you will repeat the prompt and validate the response. This will continue
  until both are valid.
import java.util.Scanner;

public class Auction {
    public static void main(String[] args) {
        Scanner scan = new Scanner(System.in);
        boolean priceValid = false;
        boolean quantValid = false;
        String priceString = "";
        String quantString = "";
        float price = 0.0f;
        int quant = 0;
        // Stay in the loop until both inputs are valid, re-prompting only for invalid ones.
        while (priceValid == false || quantValid == false) {
            if (priceValid == false) {
                priceString = getPriceString(scan);
                priceValid = validatePrice(priceString);
                if (priceValid == true) {
                    price = Float.parseFloat(priceString);
                }
            }
            if (quantValid == false) {
                quantString = getQuantString(scan);
                quantValid = validateQuant(quantString);
                if (quantValid == true) {
                    quant = Integer.parseInt(quantString);
                }
            }
            if (priceValid == false) {
                System.out.println("Not a valid price. Prices must be of the form \"12.45\"\n");
            }
            if (quantValid == false) {
                System.out.println("Not a valid quantity. Quantity must be a whole number greater than zero.");
            }
        }
        printConfirmation(quant, price);
    }

    public static String getPriceString(Scanner scan) {
        return getStringValue("Please type the price that you are willing to pay: ", scan);
    }

    public static String getQuantString(Scanner scan) {
        return getStringValue("Please indicate the number of items that you wish to purchase: ", scan);
    }

    // Prints the prompt and returns the next whitespace-delimited token.
    public static String getStringValue(String msg, Scanner scan) {
        System.out.print(msg);
        String value = scan.next();
        return value;
    }

    public static boolean validatePrice(String price) {
        int decimalCount = 0;
        boolean seenDigit = false;
        for (int i = 0; i < price.length(); i++) {
            char c = price.charAt(i);
            // check to see if we've hit a period. if we've hit a second
            // period, no good.
            if (c == '.') {
                decimalCount++;
                if (decimalCount == 2) {
                    return false;
                }
            }
            // otherwise, if it's not a digit, return false
            else {
                if (isNum(c) == false) {
                    return false;
                } else { // this is a digit
                    seenDigit = true;
                }
            }
        }
        // if we get to the end of the string without seeing a digit, it's no good
        if (seenDigit == true) {
            return true;
        } else {
            return false;
        }
    }

    public static boolean isNum(char c) {
        return (c >= '0' && c <= '9');
    }

    public static boolean validateQuant(String q) {
        for (int i = 0; i < q.length(); i++) {
            char c = q.charAt(i);
            if (isNum(c) == false) {
                return false;
            }
        }
        // all digits. check to see >0. A digit string too large for an int
        // makes parseInt throw, so that case is rejected as well.
        try {
            int quan = Integer.parseInt(q);
            return quan > 0;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void printConfirmation(int q, float p) {
        String item = "item";
        if (q > 1) {
            item += "s";
        }
        String msg = "You are buying " + q + " " + item + " at $" + p;
        if (q > 1) {
            msg += " each";
        }
        System.out.println(msg);
    }
}
Security Checklist:
Vulnerability: Input Validation
Course: CS2
Task – Check each line of code | Completed
1. Mark each variable that receives external input with a V
For each statement that is marked with a V, verify that the variable is
checked for each of these criteria. Note any that is not checked for:
3. Length
4. Range (reasonableness?)
5. Format
6. Type
Shaded areas indicate vulnerabilities!
Discussion Questions:
1. You're writing a program that asks the user to type in a telephone number. How might
you validate that the characters that they've typed represent a legal telephone
number? You should assume that you're only concerned about phone numbers from
the US, but you want to give users as much flexibility as possible, in terms of spaces
and punctuation characters. List some rules that you might use. Make sure that you
complete this question before moving on to question #2.
Remove any parentheses, dashes, or spaces, then verify that exactly 10 digits remain.
2. Find an example of a phone number that doesn't fit your rules.
Anything that requires a leading 1 – as in 1 410 555 1212
Any number specified with a “+” at the beginning: +1 410 555 1212
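The rules from question 1, extended to cover the two exceptions from question 2, might be sketched as follows. The class and method names are hypothetical, and the sketch is deliberately lenient: it strips the allowed punctuation wherever it appears instead of checking that it is well placed.

```java
import java.util.Optional;

final class UsPhone {
    // Returns the bare 10-digit number, or empty if the input does not fit the rules.
    static Optional<String> normalize(String typed) {
        if (typed == null) return Optional.empty();
        String s = typed.trim();
        boolean hadPlus = s.startsWith("+");
        if (hadPlus) {
            s = s.substring(1);
        }
        // Remove the punctuation the rules allow: parentheses, dashes, spaces.
        s = s.replaceAll("[()\\- ]", "");
        if (!s.matches("\\d+")) return Optional.empty();
        if (s.length() == 11 && s.charAt(0) == '1') {
            s = s.substring(1); // drop the leading country code 1
        } else if (hadPlus) {
            return Optional.empty(); // "+" must introduce country code 1
        }
        return s.length() == 10 ? Optional.of(s) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(normalize("(410) 555-1212"));
        System.out.println(normalize("+1 410 555 1212"));
        System.out.println(normalize("555-1212"));
    }
}
```

Other common styles still fall outside these rules. For example, a number written with dots, such as 410.555.1212, is rejected here, which illustrates how hard it is to anticipate every legitimate format.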
3. Describe an example of an input validation problem that you may have
encountered. If you can't remember having any sort of problem, try some web pages or
other software tools – try to find a system that fails to validate input data correctly.
Taking zip codes without verifying 5 digits, accepting dates that have already passed, improper
formats for phone numbers, etc.
4. If input is sufficiently cryptic, it might be hard to provide useful error messages in
responses to invalid input. Describe some strategies that might be used to help users
recover from invalid input.
Show example formats that indicate what correct input looks like, and write error messages that
describe the specific difficulty with the input as provided. Use flexible input screens that let
users correct multiple errors at once, as opposed to fixing them one at a time.
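One way to support correcting several errors at once is to have the validation routine collect every problem it finds, rather than stopping at the first. The sketch below assumes a hypothetical form with a ZIP code and a month field. The class name and message wording are illustrative only, and each message includes an example of correct input.

```java
import java.util.ArrayList;
import java.util.List;

final class FormValidator {
    // Checks every field and returns all problems found; an empty list means valid.
    static List<String> validate(String zip, String month) {
        List<String> errors = new ArrayList<>();
        if (zip == null || !zip.matches("\\d{5}")) {
            errors.add("ZIP code must be exactly 5 digits, for example 12345.");
        }
        if (month == null || !month.matches("\\d{1,2}")) {
            errors.add("Month must be a number, for example 7 for July.");
        } else {
            int m = Integer.parseInt(month); // safe: at most two digits
            if (m < 1 || m > 12) {
                errors.add("Month must be between 1 and 12, but was " + m + ".");
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        // Both fields are wrong, so both messages are reported together.
        for (String problem : validate("1234", "13")) {
            System.out.println(problem);
        }
    }
}
```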
5. Revisit Programming Exercise 6. Are there any inputs that the above description
accepts as valid that perhaps should be considered invalid? If so, what are they and
how might you handle them?
The price check accepts values with three or more digits after the decimal point, such as 12.345. It
should probably be modified so that at most two digits are allowed after the decimal point, making the
number of cents whole.
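A stricter price check along these lines might look like the sketch below. It departs from the exercise's character-by-character approach in two ways, both of which are choices made for this illustration: it uses a regular expression, and it converts to java.math.BigDecimal instead of float, since a float cannot represent most cent values exactly. The class name StrictPrice is hypothetical.

```java
import java.math.BigDecimal;

final class StrictPrice {
    // Either digits with an optional "." and one or two digits,
    // or a bare "." followed by one or two digits.
    private static final String PATTERN = "\\d+(\\.\\d{1,2})?|\\.\\d{1,2}";

    static boolean isValid(String s) {
        return s != null && s.matches(PATTERN);
    }

    // Converts a validated price; refuses anything that fails the check.
    static BigDecimal parse(String s) {
        if (!isValid(s)) {
            throw new IllegalArgumentException("Not a valid price: " + s);
        }
        return new BigDecimal(s);
    }

    public static void main(String[] args) {
        System.out.println(isValid("12.34"));
        System.out.println(isValid("12.345"));
        System.out.println(parse(".34"));
    }
}
```

Unlike the exercise's validatePrice(), this pattern also rejects a trailing decimal point with no cents, as in "12.". Whether that input should count as valid is a design decision.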