Programming for Biologists
BASH – in-class assignment
October 29, 2014
Credit. The following assignment is for participation credit (not graded). Please return
to me by Monday morning, November 3rd.
PART 1
1) Create a new directory called bash_practice. Cd into this directory.
2) From this directory, create a bash script called make_dirs.sh that creates 100
directories called dir1, dir2, etc.
3) Make your bash script executable, and test it by running it.
4) Add to your script:
Create one file inside each directory, named myfile.1.dat, myfile.2.dat, etc. For
example, dir1 should contain only myfile.1.dat. Test it and make sure it
works.
5) Add to your script:
Inside each file, add the text “This is file number N.”, where N is replaced with
the respective file number (e.g., myfile.1.dat should contain the text
“This is file number 1.”). Test it and make sure it works.
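As a rough illustration of steps 2, 4, and 5, here is one possible sketch of make_dirs.sh (not the only way to do it). The function name make_dirs and its count argument are illustrative names that do not come from the assignment; the script is assumed to be run from inside bash_practice.

```shell
#!/bin/bash
# make_dirs.sh: a sketch of Part 1, steps 2, 4, and 5.
# Assumption: make_dirs and its count argument are illustrative names.

make_dirs() {
    i=1
    while [ "$i" -le "$1" ]; do
        mkdir -p "dir$i"                                        # step 2: dir1, dir2, ...
        echo "This is file number $i." > "dir$i/myfile.$i.dat"  # steps 4 and 5
        i=$((i + 1))
    done
}

# Create dir1 through dir100 in the current directory.
make_dirs 100
```

For step 3, chmod +x make_dirs.sh makes the script executable, and ./make_dirs.sh runs it. Calling make_dirs with a small number first (e.g., 3) makes the result easier to inspect.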
PART 2
Copy /home/b/bio6297/eo13/files_for_students/results
from my directory on xanadu to your home account or your personal machine.
Background. Imagine that you recently ran some molecular evolution analyses, and
the results consist of text files, each containing the results for a single gene. Example
results can be found in the results directory.
Your Task.
(1) Write a bash script that copies the ‘tgr’ gene files (genes that are members of the tgr
gene family in Dictyostelium) to a new directory called output_tgrs.
(Do this task first, and ensure that it works, before moving on to the next step.)
(2) Add to your existing script the following functionality:
Generate a list of the files that contain error messages (“Error:”) and save it in a
file called tgrs_with_errors.
(3) Use command-line arguments to your script to allow the user to specify the input
directory, output directory, and output file.
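The three tasks above could be combined roughly as follows. This is only a sketch under stated assumptions: the script name copy_tgrs.sh and the function name copy_tgrs are hypothetical, and the tgr gene files are assumed to be recognizable by “tgr” appearing in the file name (check the results directory to confirm this before relying on it).

```shell
#!/bin/bash
# copy_tgrs.sh (hypothetical name): copy tgr gene files and list those with errors.
# Usage: ./copy_tgrs.sh INPUT_DIR OUTPUT_DIR OUTPUT_FILE
# Assumption: tgr gene files have "tgr" somewhere in their file name.

copy_tgrs() {
    input_dir=$1
    output_dir=$2
    output_file=$3

    mkdir -p "$output_dir"
    : > "$output_file"              # start with an empty list

    for f in "$input_dir"/*tgr*; do
        [ -f "$f" ] || continue     # skip if the pattern matched nothing
        cp "$f" "$output_dir"/      # task 1: copy the tgr file
        # task 2: grep -q succeeds when the string is found
        if grep -q "Error:" "$f"; then
            basename "$f" >> "$output_file"
        fi
    done
}

# Task 3: take the three values from the command line.
if [ "$#" -eq 3 ]; then
    copy_tgrs "$1" "$2" "$3"
else
    echo "Usage: $0 INPUT_DIR OUTPUT_DIR OUTPUT_FILE" >&2
fi
```

With these assumptions, the script would be run as ./copy_tgrs.sh results output_tgrs tgrs_with_errors.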
Complete the following steps before writing your script.
1) Write some documentation at the start of the script describing its purpose and
the syntax for running it. Lines that the interpreter ignores are called
“comments”; to add one, simply start the line with a “#”. For example:
# This line is a comment. The interpreter will ignore it.
# I can add additional lines, as long as they start with “#”
2) Decide what arguments the program must take as input and what outputs it will
produce. Add this information to the documentation at the start of the program.
3) Write the steps the program will take in pseudo-code. What is the first thing the
program needs to do? The second thing? What variables need to be created?
You can read more about pseudocode here:
https://www.khanacademy.org/computing/computer-programming/programming/goodpractices/p/pseudo-code
http://www.unf.edu/~broggio/cop2221/2221pseu.htm
http://users.csc.calpoly.edu/~jdalbey/SWE/pdl_std.html
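To illustrate the three planning steps above, a script header for Part 2 might look something like this. The script name copy_tgrs.sh, the argument names, and the usage function are all hypothetical choices made for this sketch; the pseudocode is one possible breakdown, not the required one.

```shell
#!/bin/bash
# Purpose: copy the tgr gene result files to a new directory and
#          list the ones that contain "Error:".
# Usage:   ./copy_tgrs.sh INPUT_DIR OUTPUT_DIR OUTPUT_FILE
# Inputs:  INPUT_DIR    directory holding the per-gene result files
# Outputs: OUTPUT_DIR   directory that receives the copied tgr files
#          OUTPUT_FILE  text file listing the tgr files with errors
#
# Pseudocode:
#   read the three arguments into variables
#   create OUTPUT_DIR and an empty OUTPUT_FILE
#   FOR each tgr file in INPUT_DIR
#       copy the file to OUTPUT_DIR
#       IF the file contains "Error:" THEN add its name to OUTPUT_FILE
#   END FOR

# Print the syntax described in the header (hypothetical helper).
usage() {
    echo "Usage: copy_tgrs.sh INPUT_DIR OUTPUT_DIR OUTPUT_FILE"
}
```

Writing the pseudocode as comments means it can stay in the file and be filled in with real commands one step at a time.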
Some suggestions about how to tackle this program, step-by-step:
1) First write the code that reads in the inputs and creates the required outputs (files or
directories). Check that a program that only reads in the inputs and creates empty
outputs works.
2) Once you know that the correct inputs go in and the correct outputs are created
(i.e., files and directories), decide what loop (if any) you want to use
and what it should accomplish. Try things on a small scale first: e.g., without a
loop, just create dir1 and myfile.1.dat (and add the line to that file). Once
that works, wrap it in a loop that does it for a range of numbers.
3) Try adding a skeleton loop to your program, and check that it works by having it echo
something from inside the loop, e.g., “Here I am inside the loop.” Run this program.
How many times does it go through the loop… and does this make sense?
4) Try having the loop report something else that will tell you whether it is working. For
example, if the loop is supposed to go through each file in a directory, have it
report the filename each time (using echo). Run the program and see whether the expected
behavior occurs.
5) Finally, if all of these steps work, and you are confident the loop is entered and
exited correctly, then (and only then!) try adding additional code inside the loop that
does what you want it to do. Keep using echo to report what it is doing. When you
have finally gotten everything working, remove all of these “echo” commands and
check that the streamlined code gives you the same result.
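Suggestions 3 and 4 above might look something like the following skeleton, which does nothing except report what the loop sees. The function name report_files and the variable count are illustrative names for this sketch only.

```shell
#!/bin/bash
# Skeleton loop: it only echoes, so it is safe to run while debugging.
# Assumption: report_files and count are illustrative names.

report_files() {
    count=0
    for f in "$1"/*; do
        [ -e "$f" ] || continue     # the pattern matched nothing (empty or missing directory)
        echo "Here I am inside the loop: $f"
        count=$((count + 1))
    done
    echo "Went through the loop $count times."
}

# Only run when a directory is given on the command line.
if [ "$#" -ge 1 ]; then
    report_files "$1"
fi
```

If the reported count matches the number of files in the directory, the loop is being entered and exited as expected, and the echo line can then be replaced with the real work.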