project2 : Reading file/URL inputs and processing/formatting the results

Assigned: Fri 05/19 08:00AM
Due: Tue 06/06 11:59PM

CS 8, Spring 2017: Programming Project 2

Due: Tuesday, 6/6 11:59pm

It is optional, but highly recommended, to work with a partner on this project.

CountChars.py

(100 points: reading from file or URL, formatting strings, data structures)

Begin a new file named CountChars.py by typing a comment at the top with your name (and the name of your lab partner if you work together), and the current date.

You will notice the instructions are a bit shorter this time. You are given a problem to solve, but you must figure out how to solve it yourself. Keep in mind what you have learned about top-down programming by stepwise refinement: break the problem into meaningful parts; then solve each part separately. And have fun!

The problem is to write a Python3 program that reads a text file or web page named by the user, and prints a neat table showing the counts of all of the different characters it reads. You must also meet the following explicit specifications:

Because your output must exactly match the output of our solution (see the three examples below), copy the following string and dictionary constants and paste them at the beginning of CountChars.py (after your header comment, of course):

# strings that must be used for output:
prompt = "Enter filename: "
titles = "char       count\n----       -----"
itemfmt = "{0:5s}{1:10d}"
totalfmt = "total{0:10d}"
whiteSpace = {' ':'space', '\t':'tab', '\n':'nline', '\r':'crtn'}
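To see what these constants produce, here is a small illustrative sketch (not part of the required program). The sample values 'a', 42, 3 and 1234 are made up for the demonstration. Each table row is 15 characters wide: the label is left-aligned in 5 columns and the count is right-aligned in 10 columns.

```python
# Constants copied from the assignment text above.
itemfmt = "{0:5s}{1:10d}"
totalfmt = "total{0:10d}"
whiteSpace = {' ':'space', '\t':'tab', '\n':'nline', '\r':'crtn'}

# Hypothetical sample values, chosen only to show the layout.
row = itemfmt.format('a', 42)                # ordinary character
ws_row = itemfmt.format(whiteSpace['\t'], 3) # white space shown by its description
total_row = totalfmt.format(1234)

# repr() makes the padding visible when printed in the shell.
print(repr(row))
print(repr(ws_row))
print(repr(total_row))
print(len(row), len(ws_row), len(total_row))  # every row has the same width
```

Because the padding comes entirely from the format strings, your program never needs to add spaces of its own.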

When the module is run in IDLE, it should immediately prompt the user to enter the filename, and then print the table of counts in the Python shell window. Alternatively, a user can run the program directly from the command prompt, without first starting the Python shell or IDLE, by typing the following:

-bash-4.2$ python3 CountChars.py

Execution must proceed as follows:

  1. Use the built-in function input to get the filename from the user. Pass the string named prompt to this function.
  2. If the filename does not begin with "http" then assume it is a local file, and open it for reading with the built-in open function. Otherwise import and use the urllib.request.urlopen function to open the web page for reading. The program has to be able to do this automatically.
  3. Read all of the text in the file or web page, and count each different character in it. Ignore the case of characters (e.g., count both 'A' and 'a' as 'a').
  4. Print the string named titles. Then print the table of characters and their counts in ASCII order, and print the total character count at the bottom. Hint: There are 128 characters in the ASCII code, starting from 0 and ending at 127.
  5. Use the string named itemfmt to print each interior row of the table in the proper format. For example, if c is a character, and ccount is its count, then print that row of the table as follows:

         print( itemfmt.format(c, ccount) )

  6. If the character being printed is one of the white space characters in the dictionary named whiteSpace, then print its description instead of the character itself. For example:

         print( itemfmt.format(whiteSpace[c], ccount) )

  7. Use the string named totalfmt to properly print the total character count. If this count is named total, for example:

         print( totalfmt.format(total) )

  8. Fully test your program. Here are sample input files and web pages, and the associated results from our solution. Make sure that your results exactly match these results. Also realize that the graders will probably test other inputs too.
File/Page                                      Program Run
short.txt                                      short.txt run
longer.txt                                     longer.txt run
http://cs.ucsb.edu/~zmatni/cs8s17/index.html   CS8 Home Page run

These files can also be located at: http://www.cs.ucsb.edu/~zmatni/cs8s17/projects/proj2/.
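One possible way to break the execution steps listed above into parts is sketched below. It is only an outline under stated assumptions, not our solution. The function names read_text, count_chars and table_lines are hypothetical. Decoding web pages as UTF-8 and adding up only the characters shown in the table are assumptions of this sketch, so check your totals against the sample runs.

```python
import urllib.request

# Constants copied from the assignment text.
titles = "char       count\n----       -----"
itemfmt = "{0:5s}{1:10d}"
totalfmt = "total{0:10d}"
whiteSpace = {' ':'space', '\t':'tab', '\n':'nline', '\r':'crtn'}


def read_text(name):
    """Return all text from a web page (name starts with "http") or a local file."""
    if name.startswith("http"):
        # urlopen yields bytes; decoding them as UTF-8 is an assumption here.
        with urllib.request.urlopen(name) as page:
            return page.read().decode("utf-8", errors="replace")
    with open(name) as infile:
        return infile.read()


def count_chars(text):
    """Map each character, with case ignored, to how many times it occurs."""
    counts = {}
    for c in text.lower():
        counts[c] = counts.get(c, 0) + 1
    return counts


def table_lines(counts):
    """Build the titles, one row per ASCII character seen, and the total line."""
    lines = [titles]
    total = 0
    for code in range(128):            # ASCII order: codes 0 through 127
        c = chr(code)
        if c in counts:
            label = whiteSpace.get(c, c)   # description for white space, else c itself
            lines.append(itemfmt.format(label, counts[c]))
            total += counts[c]         # assumption: total covers only the rows shown
    lines.append(totalfmt.format(total))
    return lines
```

A main program built on this outline would pass the result of input(prompt) to read_text, then print each string returned by table_lines. Keeping the three parts separate lets you test the counting and the formatting on short strings before trying a real file or web page.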


Go to CSIL (in person, unless you can manage this step remotely without any assistance from us). Open a terminal window, cd to the directory that contains your source code file, then type the following (careful: you are turning in to Proj2, with an uppercase P, at class account cs8):

turnin Proj2@cs8 CountChars.py

If you are working with a partner, be sure that both partners' names are in the comments at the top of all source files.

If you run into problems, be sure to ask questions on Piazza or visit your TA’s or your instructor’s office hours.


Created by Ziad Matni, (c) 2017, partly based on prior work by M. Costanzo and others.