Probabilistic Text Generation with HashMap<K, V> and java.util.ArrayList<E>
You are asked to complete class RandomWriterWithHashMap.java, which provides a random writing application. It must use
java.util.HashMap<K, V> to store all possible nGrams (short strings) as keys, each mapped to the list of all
characters that follow it in the text as the value.
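As a rough illustration of that key-to-followers mapping (not the required setUpMap method), the sketch below builds such a map for a tiny string. The class name FollowerMapSketch and the helper buildFollowers are hypothetical names made up for this example, and it assumes that an nGram at the very end of the text, which has no following character, is simply not recorded.

```java
import java.util.ArrayList;
import java.util.HashMap;

public class FollowerMapSketch {
    // Hypothetical helper (not part of the starter code): maps each nGram of
    // length n to the list of characters that immediately follow it in text.
    static HashMap<String, ArrayList<Character>> buildFollowers(CharSequence text, int n) {
        HashMap<String, ArrayList<Character>> followers = new HashMap<>();
        // Assumption: stop early enough that every recorded nGram has a follower.
        for (int i = 0; i + n < text.length(); i++) {
            String key = text.subSequence(i, i + n).toString();
            char next = text.charAt(i + n);
            // Create the list the first time a key is seen, then add the follower.
            followers.computeIfAbsent(key, k -> new ArrayList<>()).add(next);
        }
        return followers;
    }

    public static void main(String[] args) {
        HashMap<String, ArrayList<Character>> map = buildFollowers("abcabd", 2);
        System.out.println(map.get("ab")); // [c, d]
        System.out.println(map.get("bc")); // [a]
        System.out.println(map.get("ca")); // [b]
    }
}
```

Duplicate followers are kept on purpose: a character that follows an nGram more often appears more often in the list, so it is more likely to be picked later.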
Your program file is to be turned in to the D2L dropbox named RandomWrite. While testing your code, use any very
small input file you want or an entire book from Project Gutenberg. You must use the java.util.HashMap<K, V> class,
java.util.ArrayList<E>, and the following algorithm:
1. (This is done) Read all file input into one big string (already done in RandomWriterWithHashMap.java; it is actually a
StringBuilder object to save time, since StringBuilder has many of the methods of String plus a fast append method)
2. Complete method setUpMap() so it initializes the HashMap instance variable all to have every possible
nGram of the given nGram length as the key and an ArrayList<Character> as the value for each mapping.
These ArrayList values must contain all characters that follow each key in the given text input.
3. Done: Pick a random nGram from the original text. It is stored in the instance variable nGram.
4. Complete method printRandom(int howMany) so it prints howMany characters with this algorithm:
• Get the list of all the characters that follow the nGram (the instance variable nGram is set for you
initially) from all (the HashMap instance variable you built in setUpMap).
• Randomly select one of the characters from that list of followers
• Print that random character
• Change the nGram so the first character is gone and the just-printed random character is appended
(nGram must be the same length at the end)
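The generation loop in step 4 can be sketched on its own as follows. This is only an illustration under stated assumptions, not the required printRandom method: the class GenerateSketch and the helper generate are hypothetical names, the follower map is built by hand for the text "abcabcab" with nGram length 2, the result is returned as a String rather than printed, and the loop simply stops if it reaches an nGram with no recorded followers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Random;

public class GenerateSketch {
    // Hypothetical helper: returns up to howMany characters generated from the
    // follower map, starting with the nGram given as seed.
    static String generate(HashMap<String, ArrayList<Character>> followers,
                           String seed, int howMany, Random rng) {
        StringBuilder out = new StringBuilder();
        String nGram = seed;
        for (int i = 0; i < howMany; i++) {
            ArrayList<Character> choices = followers.get(nGram);
            if (choices == null || choices.isEmpty()) {
                break; // Assumption: stop when the nGram has no followers.
            }
            // Pick one follower at random and emit it.
            char next = choices.get(rng.nextInt(choices.size()));
            out.append(next);
            // Slide the window: drop the first char, append the new one.
            nGram = nGram.substring(1) + next;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hand-built map for the text "abcabcab" with nGram length 2.
        HashMap<String, ArrayList<Character>> followers = new HashMap<>();
        followers.computeIfAbsent("ab", k -> new ArrayList<>()).add('c');
        followers.computeIfAbsent("bc", k -> new ArrayList<>()).add('a');
        followers.computeIfAbsent("ca", k -> new ArrayList<>()).add('b');

        // Every list has one entry, so the result does not depend on the seed of rng.
        System.out.println(generate(followers, "ab", 6, new Random())); // cabcab
    }
}
```

Because the nGram always loses one character and gains one, its length stays constant, which is what keeps every lookup a valid key length for the map.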
Start with the following program and complete methods setUpMap and printRandom
// This program generates 500 characters of probabilistic text.
// When the nGram length is small, as in 2 or 3, the text should
// be very random. When the nGram length is 5, words should appear.
// At 14 or more, you are probably getting exact sentences from the book.
//
// Warning: A big input file and a large nGram length can run your computer out of memory.
//
// @author YOUR NAME
//
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Random;
import java.util.Scanner;
public class RandomWriterWithMap {
public static void main(String[] args) {
Scanner keyboard = new Scanner(System.in);
System.out.print("Enter the file name to random write: ");
String fileName = keyboard.nextLine();
System.out
.print("Enter nGram length, 1 is like random, 12 is like the book: ");
int nGramLength = keyboard.nextInt();
keyboard.close();
RandomWriterWithMap rw = new RandomWriterWithMap(fileName, nGramLength);
rw.printRandom(500);
}
private HashMap<String, ArrayList<Character>> all;
private int nGramLength;
private String fileName;
private StringBuilder theText;
private static Random generator;
private String nGram;
public RandomWriterWithMap(String fileName, int nGramLength) {
this.fileName = fileName;
this.nGramLength = nGramLength;
generator = new Random();
makeTheText();
setRandomNGram();
setUpMap(); // Algorithm considered during section.
}
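private void makeTheText() {
theText = new StringBuilder();
// try-with-resources closes the Scanner once reading is finished
try (Scanner inFile = new Scanner(new File(fileName))) {
while (inFile.hasNextLine()) {
theText.append(inFile.nextLine().trim());
theText.append(' ');
}
} catch (FileNotFoundException e) {
// Without the input text nothing can be generated, so stop here
// instead of running into a NullPointerException later.
System.err.println("Could not open file: " + fileName);
System.exit(1);
}
}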
public void setRandomNGram() {
generator = new Random();
int start = generator.nextInt(theText.length() - nGramLength - 1);
nGram = theText.substring(start, start + nGramLength);
}
// Read theText char by char to build a HashMap where
// every possible nGram exists with the list of followers.
// This method needs these three instance variables:
// nGramLength, theText, all
private void setUpMap() {
// TODO: Implement this method
}
// Print howMany random characters. Please insert line breaks to make your
// output readable to the poor grader :-)
void printRandom(int howMany) {
// TODO: Implement this method
}
}
Grading Criteria 50 pts max. Turn this in to the D2L Drop Box RandomWrite
___/ +50 Generates text that gets closer to the original as the nGram length increases (subjective). For example,
when nGram length = 2, a few words may appear; but when it is 12, some sentences appear close to the original
text.
• -50 If no text is generated with a printRandom(500) message
• -40 If you did not use a HashMap<K, V> and the algorithm presented in section that uses a Map to
set up all nGrams and the list of followers for each before printing
• -40 If text has no apparent differences with different nGram lengths or file input
• -40 Output does not seem reasonable