The third assignment is mostly about sorting and how fast things go. You will also write yet another implementation of the Array interface to help you analyze how many array operations various sorting algorithms perform.
Note: The grading criteria now include points for unit testing with JUnit, so you need to be able to work with JUnit for this assignment. This refers to JUnit 4 test drivers, not some custom test program you hacked together. Assignments will specify which JUnit test drivers you should improve or add to.
This assignment has a slightly more advanced package setup. This time, since we rely on our SimpleArray from the hw2 package, we provide you with the necessary files from homework 2 in exceptions and hw2. They are the same as last assignment. For this assignment, you have a new directory data with the data files needed for Part B, and hw3 with skeleton code for all parts.
Files you will be editing are marked with an asterisk (*).
hw3-student.zip
--
exceptions/
    IndexException.java
    LengthException.java
hw2/
    Array.java
    SimpleArray.java
data/
    random.data
    ascending.data
    descending.data
hw3/
    BubbleSort.java *
    GnomeSort.java
    InsertionSort.java *
    NullSort.java
    PolySort.java *
    SelectionSort.java
    SortingAlgorithm.java
    Measured.java
    MeasuredArray.java *
    MeasuredArrayTest.java *
These provided files should compile as is. You can compile everything by running $ javac -Xlint:all hw3/*.java hw2/*.java exceptions/*.java, or using the generic compile script from the Java Command Line notes. You should only need to compile exceptions/*.java and hw2/*.java one time.
Your first task for this assignment is to develop a new kind of Array implementation that keeps track of how many access and mutate operations have been performed on it. It also counts the number of occurrences of a particular value in the Array. Check out the Measured interface first, reproduced here in compressed form (be sure to read and use the full interface):
This describes what we expect of an object that can collect statistics about itself. After a Measured object has been “in use” for a while, we can check how many access and mutate operations it has been asked to perform, through the accesses and mutations methods, respectively. We can also tell it to “forget” what has happened before and start counting both kinds of operations from zero again using the reset method.
You need to develop a class MeasuredArray that extends our dear old SimpleArray and also implements the Measured interface; yes, both at the same time. When a MeasuredArray is created, you initialize internal counters to keep track of the number of access and mutate operations it has been asked to perform so far; obviously both counts start at zero. You will need to override the accessor and mutator methods of the class so that the relevant counter is incremented each time that type of operation succeeds. The overridden methods must also call the actual operation in the super class using Java’s super keyword. (Rewriting these methods instead of using inheritance properly will result in significant style deductions, as well as compiler issues since you can’t change the provided SimpleArray.java code.)
Don’t forget that your constructor for MeasuredArray will also have to invoke the SimpleArray constructor! However, this operation is neither an accessor nor a mutator.
Consider a freshly constructed MeasuredArray object. It would return 0 for both accesses and mutations. Now imagine we call the length operation followed by three calls to the get operation. At this point, our object would return 4 for accesses but still 0 for mutations. If we now call the put operation twice, the object would return 2 for mutations but still 4 for accesses. (You don’t have to check whether a put operation actually changes the value or not since that’s not how the put operation was originally written.)
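The counting pattern described above can be sketched on a toy class. This is only an illustration under stated assumptions: TinyArray, CountingTinyArray, and CountingDemo are hypothetical stand-ins invented for this sketch, not the provided SimpleArray, Measured, or MeasuredArray, and the toy stores plain ints and throws the built-in ArrayIndexOutOfBoundsException rather than the course's IndexException. The point is the order of operations: the super call runs first, so the counter is only incremented when the operation succeeds.

```java
// Hypothetical stand-in for a simple array class; NOT the course's SimpleArray.
class TinyArray {
    private final int[] data;

    TinyArray(int n) {
        data = new int[n];
    }

    int length() {
        return data.length;
    }

    int get(int i) {
        return data[i]; // throws ArrayIndexOutOfBoundsException on a bad index
    }

    void put(int i, int v) {
        data[i] = v; // throws ArrayIndexOutOfBoundsException on a bad index
    }
}

// Hypothetical counting subclass: every override delegates to super first,
// so a failed operation (an exception) never reaches the increment.
class CountingTinyArray extends TinyArray {
    private int accesses;
    private int mutations;

    CountingTinyArray(int n) {
        super(n); // construction is neither an access nor a mutation
    }

    @Override
    int length() {
        int n = super.length();
        accesses++;
        return n;
    }

    @Override
    int get(int i) {
        int v = super.get(i);
        accesses++;
        return v;
    }

    @Override
    void put(int i, int v) {
        super.put(i, v);
        mutations++;
    }

    int accesses() {
        return accesses;
    }

    int mutations() {
        return mutations;
    }
}

class CountingDemo {
    public static void main(String[] args) {
        CountingTinyArray a = new CountingTinyArray(3);
        a.length();
        a.get(0);
        a.get(1);
        a.get(2);
        a.put(0, 7);
        a.put(0, 7); // same value again still counts as a mutation
        System.out.println(a.accesses() + " " + a.mutations()); // prints "4 2"
    }
}
```

The demo replays the scenario from the text: one length call plus three get calls gives 4 accesses, and two put calls give 2 mutations.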
The reset operation should set both the number of accesses and mutations back to 0. Lastly, implement the count method which should determine and return the number of occurrences of the parameter value. Since it will always need to inspect every value in the array, it should also naturally update the accesses value accordingly. [NOTE: be careful about this if you use an iterator to implement count - see discussion questions below.]
We provide you with SimpleArray, but we leave it as part of the hw2 package. So, you should only need to compile the hw2 folder one time to be able to use SimpleArray in your solution. In the skeleton code we already have the import taken care of (note the import hw2.SimpleArray statement). You do not need to hand in SimpleArray on Gradescope.
You will need to write JUnit 4 test cases for MeasuredArray. Your focus should be on the Measured aspect of the class (i.e., reset, accesses, mutations, count), but you will need to call Array and ArrayInterface methods to trigger the various possible outcomes. The tests you write do not need to check that it is a working Array implementation; we’ll do that ourselves. (Of course, you should test for yourselves that the methods do work correctly.) Later on we will show you a nice way to test with the same inheritance/interface-realization structure as the data structures you are testing, but for this first time we are just trying to get our feet wet with JUnit.
The file you need to add unit tests to is MeasuredArrayTest.java. We provide you skeleton code with the basic @Before and @Test annotations. For this requirement you don’t need anything fancier than that. Make sure that you pass your own tests in your final deliverable. If you don’t pass your tests but want to receive credit for writing tests, comment those tests out. You will receive some autograder points for passing your own tests.
Since we are only concerned with testing Measured, there isn’t any exception testing to cover. So, you don’t need to use the expected parameter of the @Test annotation. But, if you want to test some Array axioms, you could have something like this: @Test(expected = IndexException.class) and write code that should trigger an IndexException.
For help running your JUnit4 tests, see the JavaCommandLineNotes and JUnitIntellij notes in Piazza Resources.
In your README for Part A, discuss from a design perspective whether or not iterating over a MeasuredArray should affect the accesses and mutation counts. Note that for the purposes of this assignment we are NOT asking you to rewrite the ArrayIterator to do so. However, if you wanted to include the next() and/or hasNext() methods in the statistics measured, can you inherit ArrayIterator from SimpleArray and override the relevant methods, or not? Explain.
When using super, you should not be duplicating any code from SimpleArray and should be making proper use of inheritance.
When calling super methods, when you call them is important. With the SimpleArray we are concerned with reporting the number of successful reads and writes. If some operation causes an exception, that operation shouldn’t affect the statistics.
Each @Test method starts with a fresh instance created from the @Before method.
SimpleArray may have one inevitable unchecked cast warning. When you compile, this one warning is fine, but your own code should not add any additional warnings.
Your second task for this assignment is to explore some of the basic sorting algorithms and their analysis. All of these algorithms are quadratic in terms of their asymptotic performance, but they nevertheless differ in their actual performance.
We’ll focus on the following three algorithms:
The provided files contain a basic framework for evaluating sorting algorithms. You’ll need a working MeasuredArray class from Problem 1, and you’ll need to understand the following interface as well (again compressed; be sure to read and use the full interface):
public interface SortingAlgorithm<T extends Comparable<T>> {
    void sort(Array<T> array);
    String name();
}
Let’s look at the simple stuff first:
An object is considered an algorithm suitable for sorting in this framework if (a) we can ask it to sort a given Array and (b) we can ask it for its name (e.g. “Insertion Sort”).
The more complicated stuff is at the top: The use of extends inside the angle brackets means that any type T we want to sort must implement the interface Comparable as well. It obviously can’t just be any old type; it must be a type for which the expression “a is less than b” actually makes sense. Using Comparable in this form is Java’s way of saying that we can order the objects; you should probably read up on the details here!
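As a small illustration of the bound described above, the following sketch shows a generic method that only compiles for types that can be compared with themselves. CompareDemo and less are hypothetical names chosen for this example; they are not part of the provided framework. String and Integer both implement Comparable, so both calls are accepted.

```java
class CompareDemo {
    // T must implement Comparable<T>, so compareTo is available on a.
    // compareTo returns a negative number when a orders before b.
    static <T extends Comparable<T>> boolean less(T a, T b) {
        return a.compareTo(b) < 0;
    }

    public static void main(String[] args) {
        System.out.println(less("apple", "banana")); // prints "true"
        System.out.println(less(7, 3));              // prints "false"
    }
}
```

Calling less with a type that does not implement Comparable (for example, a plain Object) is rejected at compile time, which is exactly what the bound on SortingAlgorithm is for.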
We provide a PolySort.java that runs the various sorting algorithms on input data and reports on their statistics. We also provide a working GnomeSort implementation of SortingAlgorithm, a working SelectionSort, and a NullSort that doesn’t actually do anything. GnomeSort is an intentionally inefficient sorting algorithm that we can use for comparison.
Your first task is to implement BubbleSort and InsertionSort in the provided Java files. Note that these classes implement SortingAlgorithm.
PolySort takes one or two command line arguments: the name of the file to read is required, and the number of strings to read from that file is optional. It then runs the sorting algorithm for each implementation and reports some statistics. The following is an example invocation:
$ java PolySort random.data 4000
Algorithm        Sorted?  Size      Accesses   Mutations   Seconds
Null Sort        false    4,000            0           0  0.000007
Gnome Sort       true     4,000   32,195,307   8,045,828  0.243852
Selection Sort   true     4,000   24,009,991       7,992  0.252085
This will read the first 4000 strings from the file random.data and sort them using all available algorithms. As you can see, the program checks if the algorithm actually worked (Sorted?) and reports how many operations of the underlying MeasuredArray were used in order to perform the sort (Accesses, Mutations). Finally, the program also prints out how long it took to sort the array (Seconds) but that number will vary widely across machines so you can really only use it for relative comparisons on the machine actually running the experiment. It’s hard to use time as an actual benchmark.
Your second task is to add code to PolySort that times how long it takes the Java Collections library to perform a sort.
There are many tools in the Java Collections that store data and implement sorting. We would like you to use java.util.ArrayList to store the strings and Collections.sort to sort the strings from the file. It may be helpful to look at the documentation for Collections and ArrayList. Please print out the timing results for sorting using Java collections in the same format as above, with accesses and mutations set to 0, directly after the algorithms you implemented (with no spaces or new lines in between). For example:
$ java PolySort random.data 4000
Algorithm        Sorted?  Size      Accesses   Mutations   Seconds
Null Sort        false    4,000            0           0  0.000007
Gnome Sort       true     4,000   32,195,307   8,045,828  0.243852
Selection Sort   true     4,000   24,009,991       7,992  0.252085
(Your implementations here)
Java Collections true     4,000            0           0  1.234567
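One possible way to time a library sort is sketched below. It is a minimal illustration, not the required PolySort code: CollectionsTiming, timeSort, and isSorted are hypothetical names, the three-element list stands in for the strings read from a data file, and using System.nanoTime for the measurement is an assumption about how you might choose to time things. The output format required by the assignment is not reproduced here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class CollectionsTiming {
    // Sorts the list in place and returns the elapsed wall-clock time in seconds.
    static double timeSort(List<String> items) {
        long start = System.nanoTime();
        Collections.sort(items);
        long end = System.nanoTime();
        return (end - start) / 1e9;
    }

    // Checks that every element is <= its successor.
    static boolean isSorted(List<String> items) {
        for (int i = 1; i < items.size(); i++) {
            if (items.get(i - 1).compareTo(items.get(i)) > 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Stand-in data; in the assignment the strings come from a data file.
        List<String> items = new ArrayList<>(Arrays.asList("pear", "apple", "fig"));
        double seconds = timeSort(items);
        System.out.printf("sorted=%b seconds=%.6f%n", isSorted(items), seconds);
    }
}
```

Only the call to Collections.sort sits between the two clock readings, so reading the file and building the list are not included in the reported time.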
The emphasis of this problem is not the coding work. Rather it is on evaluating and comparing the sorting algorithms on different sets of data. We’ve provided three different data sets, and you can vary the command line argument to experiment with different sizes as well.
There is an intentional mistake within one of the provided data files. The goal of this assignment is to use the measurements to catch that mistake. If you catch it, please discuss it in your README but avoid posting on Piazza so that other students can make the same connection on their own. You might also want to correct the problem and rerun your tests on that file. Report and discuss all your run results.
In your README file you should describe the series of experiments you ran, what data you collected, and what your conclusions about the relative performance of these algorithms are. Specifically, you should address the following:
If you are using markdown formatting in your README, and are including data in table format, please make sure that it is still readable in ascii as the graders will not have a markdown-rendered view.
Your BubbleSort should break early as soon as it knows the array is sorted.
If you just copy your BubbleSort code into InsertionSort.java, you won’t get credit for InsertionSort.
Your final task for this assignment is to analyze the following descending selection sort algorithm mathematically (without running it) in detail (without using O-notation).
Here’s the code, and you must analyze exactly this code (the line numbers are given so you can refer to them in your writeup for this problem):
 1: public static void selectionSort(int[] a) {
 2:     int max, temp;
 3:     for (int i = 0; i < a.length - 1; i++) {
 4:         max = i;
 5:         for (int j = i + 1; j < a.length; j++) {
 6:             if (a[j] > a[max]) {
 7:                 max = j;
 8:             }
 9:         }
10:         temp = a[i];
11:         a[i] = a[max];
12:         a[max] = temp;
13:     }
14: }
You need to determine exactly how many comparisons C(n) and assignments A(n) are performed by this implementation of selection sort in the worst case. Both of those should be polynomials of degree 2 since you know that the asymptotic complexity of selection sort is O(n^2). (As usual we refer to the size of the problem, which is the length of the array to be sorted here, as “n” above.) Don’t forget to include the operations that control the loops.
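Although the analysis itself must be done on paper, a small counter can serve as a sanity check for one piece of it. The sketch below is a hypothetical helper (SelectionCount and elementComparisons are names invented here) that runs the same algorithm and counts only how often the element comparison on line 6 executes. It deliberately ignores the loop-control comparisons on lines 3 and 5 and all assignments, so it does not produce C(n) or A(n). For an array of length n, the inner loop body runs (n-1) + (n-2) + ... + 1 = n(n-1)/2 times regardless of the input order.

```java
class SelectionCount {
    // Runs the descending selection sort on a and returns how many times the
    // element comparison (line 6 of the listing) is executed. Loop-control
    // comparisons and all assignments are NOT counted.
    static long elementComparisons(int[] a) {
        long count = 0;
        for (int i = 0; i < a.length - 1; i++) {
            int max = i;
            for (int j = i + 1; j < a.length; j++) {
                count++; // one execution of the line 6 comparison
                if (a[j] > a[max]) {
                    max = j;
                }
            }
            int temp = a[i];
            a[i] = a[max];
            a[max] = temp;
        }
        return count;
    }

    public static void main(String[] args) {
        int n = 10;
        int[] a = new int[n];
        for (int k = 0; k < n; k++) {
            a[k] = k; // ascending input
        }
        System.out.println(elementComparisons(a)); // prints "45", i.e. 10 * 9 / 2
    }
}
```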
Important: Don’t just state the polynomials; your writeup has to explain how you derived them, ideally line by line! Anyone can google for the answer, but you need to convince us that you actually did the work.
The files you have // TODO items in are listed explicitly below:
BubbleSort.java
InsertionSort.java
MeasuredArray.java
MeasuredArrayTest.java
PolySort.java
You need to submit all of these files to the autograder along with a README. You can upload them individually or in a zip file. If you upload them in a zip file, make sure they are all at the top level; you cannot have any extra directories or else the autograder won’t be able to find them.
Make sure the code you hand in does not produce any extraneous debugging output. If you have commented out lines of code that no longer serve any purpose you should remove them.
You must hand in the source code and a README file. The README file can be plain text (README with no extension), or markdown (README.md). In your README be sure to answer the discussion questions posed in this description. You should discuss your solution as a whole and let the staff know anything important. If you are going to be using late days on an assignment, we ask that you note it in your README.
If you want to learn markdown formatting, here is a good starting point.
Once you are ready to submit your files, go to the assignment 3 page for Gradescope and click submit. Note that you can resubmit any time up until the deadline. Only your most recent submission will be graded. Please refer to the course policies regarding late days and penalties.
After you submit, the autograder will run and you will get feedback on your functionality and how you performed on our test cases. Some test cases are “hidden” from you so you won’t actually know your final score on the test cases until after grades are released. We also include your checkstyle score as a test case.
If you see the “Autograder Failed to Execute” message, then either your submission did not compile at all or there was a packaging error. Please see the Gradescope Submission Notes in Piazza Resources for help debugging why your submission is not working.
You do not need to fully implement each file before you submit, but you’ll probably fail the test cases for the parts of the assignment you haven’t done yet. Also note that only the files with // TODO items in them will be used. You cannot modify any of the provided interface files as the autograder will overwrite any changes you made with the original provided file.
For reference, here is a short explanation of the grading criteria; some of the criteria don’t apply to all problems, and not all of the criteria are used on all assignments.
Packaging refers to the proper organization of the stuff you hand in, following both the guidelines for Deliverables above as well as the general submission instructions for assignments.
Style refers to Java programming style, including things like consistent indentation, appropriate identifier names, useful comments, suitable javadoc documentation, etc. Many aspects of this are enforced automatically by Checkstyle when run with the provided configuration file.
public, protected, and private appropriately, etc.). Simple, clean, readable code is what you should be aiming for.
Testing refers to proper unit tests for all of the data structure classes you developed for this assignment, using the JUnit 4 framework as introduced in lecture. Make sure you test all parts of the implementation that you can think of and all exception conditions that are relevant.
Performance refers to how fast/with how little memory your program can produce the required results compared to other submissions.
Functionality refers to your programs being able to do what they should according to the specification given above; if the specification is ambiguous and you had to make a certain choice, defend that choice in your README file.
If your submission does not compile, you will not receive any of the autograded-points for that assignment. It is always better to submit code that at least compiles. You will get freebie points just for compiling.
If your programs have unnecessary warnings when using javac -Xlint:all you will be penalized 10% functionality per failed part. (You are also unable to use the @SuppressWarnings annotation - we use it just to filter our accepted warnings from yours.)
If your programs fail because of an unexpected exception, you will be penalized 10% functionality per failed part. (You are not allowed to just wrap your whole program into a universal try-catch.)