Goal of this homework: You'll implement some methods for building and traversing trees, and will learn to think carefully about handling input errors. You'll also have to show a little more independence in this assignment, while continuing to follow good design practices!
Collaboration policy for this homework: You must follow the CS Department Academic Integrity Code and this time may not collaborate with another student (in contrast to the previous homework). Of course, you may get help from the course staff.
How to hand in your work: instructions, upload page
What to hand in (a directory, all packaged up in a zip or tar file):
Your commented code: Any .java
files that
are necessary to make your program run correctly. Please make sure
that you include everything that is needed to make your code compile!
Note that we will test your TreeExpression
class
through the Expression
interface, just as the
ViewTree
test class does. If you do the extra credit,
also hand in FullParenTreeExpression
.
Readme
file is not required for this assignment,
but you may include one if there's anything you want to tell us.
Acknowledgments: Many thanks to Josh Dunkelman and Prof. Jonathan Cohen for their help in preparing this assignment.
How does your Java compiler (javac
) make sense of
expressions like ((3+4)*((5^6)/7))
? In this assignment,
you will write a constructor that turns an expression from a string
into a tree. You will also write methods that operate on the tree.
These methods will find the value of the expression
(15625
) and turn the expression back into a tree in
preorder, inorder, and postorder forms.
You are provided with a ViewTree
class to test your
methods. The command java ViewTree '((3+4)*((5^6)/7))'
will bring up the following window. The drawing is produced by our
ViewTree
code (using a technique explained in the textbook).
But the values in the four lines at the bottom are produced by calling
your code.
It is possible that you will extend this code in a later assignment, for example, to allow expressions with variables in them.
The expressions input to your program should appear as a form of
infix notation. However, you only have to deal with fully
parenthesized expressions. Thus, 5+2*4
is not a
legal input; it must be written as (5+(2*4))
. This saves
you from having to deal with operator precedence rules (otherwise
known as "order of operations").
The legal inputs can be defined recursively as follows:
There are more numbers in the world than 0, 1, 2, 3, 4, 5, 6,
7, 8, and 9! Any printed representation of a double
is a
legal expression string. This means you must handle complicated
representations like -6.324e+05
(Java's scientific
notation for -6.324 x 105). Fortunately, Java provides
methods for converting from strings to numbers. For example, use
new Double(s)
where s
is the string
"-6.324e+05"
. See more hints in the "Resources" section
below.
If s1
and s2
are legal
expression strings, then so are the following five parenthesized
strings: (s1+s2)
,
(s1-s2)
,
(s1*s2)
,
(s1/s2)
,
(s1^s2)
. These are the basic binary
arithmetic operations, including the ^
exponentiation
operator (implemented by the static method Math.pow()
).
Whitespace may appear anywhere in a string, except inside a number
(e.g., 6.3 24 e+05
is not legal).
Some examples of valid expressions:
5
(5+8)
((5+8)*3)
( (5+ 8)*( 3 + 5 ))
((5+(2^4))
*(3+5))
(the newline
in the middle of this expression should be ignored just like spaces or tabs)
(-3e2--5.2)
(that is, (-300) - (-5.2))
When encountering an invalid expression, you should throw an
InvalidExpressionException
whose string argument is a
clear error message. This string should tell the user what
problem you encountered and where you encountered it. Your
error messages should be comparable in their level of detail to the
following, though yours might be different (depending on how your
solution realizes that there's an error):
((5+2*4)*(3+5))
: expected a close parenthesis after "((5+2
"
((5+2
: expected a close parenthesis after "((5+2
"
((5+2)(3+5))
: expected an operator after "((5+2)
"
((5+2)x(3+5))
: expected an operator after "((5+2)
"
((5+2))*(3+5))
: expected an operator after "((5+2)
"
((5+2)
: expected an operator after "((5+2)
"
((5+(2*x))*(3+5))
: expected a number or subexpression
after
"((5+(2*
" [note: you will have to catch
NumberFormatException
to handle this case]
((5+(2*
: expected a number or subexpression after "((5+(2*
"
((5+(
: expected a number or subexpression after "((5+(
"
"
(5+(2*4))(3+5)
: found more stuff after end of expression "(5+(2*4))
"
5(3+5)
: found more stuff after end of expression "5
"
5 .3
: found more stuff after end of expression "5
"
You should start by writing a class (you can choose the name) that
implements the following BinaryTree
interface:
public interface BinaryTree { public Position root(); public Position parent(Position v); // should return null if no parent (root) public Position leftChild(Position v); // should return null if no left child public Position rightChild(Position v); // should return null if no right child public boolean isRoot(Position v); public boolean isExternal(Position v); public boolean isInternal(Position v); public void createExternal(Position v); // v is an external node; give it left // and right children with null elements, // thereby making it internal public void replaceElement(Position v, Object e); } |
You may implement the above BinaryTree
interface in any way you
choose (you are welcome to use and modify any code from the course
textbook, but not from other sources). The nodes should implement the
Position
interface in the book (supporting the
element()
method). Thus you may choose to use either
nodes with parent/child reference links or else a vector-based (heap-style)
implementation as described in the textbook.
Here is all the
code in the textbook so that you don't have to type it in
yourself. However, you will have to figure out which code to use.
Also, our interface differs slightly from the textbook's -- the
parent
, leftChild
, and
rightChild
methods return a null
Position
instead of throwing a BoundaryViolationException
-- and
you will have to follow our interface since that's what
ViewTree
expects.
Now you should write an TreeExpression
class that
implements the following Expression
interface:
/** Represents a arithmetic expression. */ public interface Expression { /** Converts to a string in prefix notation. * This can be accomplished by a preorder traversal of the tree. */ public String prefix(); /** Converts to a string in infix notation. */ public String infix(); /** Converts to a string in postfix notation. */ public String postfix(); /** Finds the value of the expression. */ public double value(); } |
Your TreeExpression
class should implement not only
Expression
but also BinaryTree
, presumably
by extending your binary tree class. That is so that when the
ViewTree
class wants to display an expression's tree, it
will be able to call methods of BinaryTree
on the
expression.
Remember that TreeExpression
extends your binary
tree class. That means that every TreeExpression
actually is a binary tree (augmented with some extra methods
like infix()
, so that it also implements Expression
).
It doesn't have to contain another binary tree.
Your TreeExpression
class should have a constructor that takes a string argument
such as "( (5+ 8)*( 3 +5 ))"
. ViewTree
will
call this constructor. If the string is invalid, the constructor
should throw an InvalidExpressionException
that contains
an appropriately detailed error message as described above. This is
the only exception that ViewTree
knows how to catch, so
you should be careful not to throw any others:
/** Thrown in response to a string that is not a legal * arithmetic expression. The String argument may be shown * to the user, so it should clearly explain what and where * the trouble is. */ public class InvalidExpressionException extends Exception { public InvalidExpressionException(String err) { super(err); // just call the general Exception constructor } } |
Note that this is an ordinary checked exception, meaning
that any method that throws the exception must declare throws
InvalidExpressionException
in its signature. This reminds a
programmer using the method that it might throw this exception, and it
forces her to either handle it (with catch
) or explicitly
declare that she would throw it further up. Although we have been
using some unchecked exceptions in recent assignments (this is done by
extending RuntimeException
instead of
Exception
), they are rare - you should only use them for
exceptions that most callers wouldn't want to handle. Here's some discussion
of this point.)
You'll probably want to test your constructor right away using the
ViewTree
class. So write temporary stubs for the other
methods. (These stubs should just return bogus values like
"unknown"
and 0
.) Then you should at least be able
to see the tree.
Once your constructor is working properly, you should fill in the
other methods of the interface. There is some discussion in the
textbook about how to do this. Now test those methods using
ViewTree
, too.
Of course, you are not a helpless slave of the
ViewTree
class: we just gave it to you to be nice. If
things don't work, or if you want a more thorough test than
ViewTree
provides, then write your own test code. For
example, once you've written TreeExpression.prefix()
, you
can and should call it from your own test method
TreeExpression.main()
.
Note: If you get a NullPointerException
from
TreeArea
, it is probably thrown by
p.element().toString()
, in which case your problem is
either that p==null
or p.element()==null
.
Here p
is a Position
that
ViewTree
has found by exploring your
TreeExpression
, starting with p=te.root()
and recursing to both te.leftChild(p)
and
te.rightChild(p)
as long as
te.isInternal(p)
. Obviously, ViewTree
must
explore the tree like this in order to find and draw all the positions
(nodes). But it should not be able to find a null
position in this way, or a position with a null
element.
If it does, there is surely something wrong with your code, resulting
in that NullPointerException
. The best way to figure out
what is going wrong is to write your
own test code that also explores the tree, and see where
your code gets into trouble.
The positions of your tree store Objects
as their
elements, of course. But what kind of objects? You should probably
store a Double
at each external node, and a
Character
such as +
at each input node.
(You should remember from homework 3 how to
convert between char
and Character
.
Converting between double
and Double
is
similar. These classes are in the java.lang
package;
refer as necessary to their
documentation.)
How do you convert a fully parenthesized expression string into a
tree? If the string is legal, one of these two cases must hold. (a)
The string is a number, in which case you can just return a one-node
tree. (b) The string starts with a left parenthesis, in which case it
must have the form (s1 c s2)
where
c
is an operator character. There are at least
three possible strategies:
In case (b) above, you can scan the string to find the operator
character, going left to right and counting +1 for left parentheses
and -1 for right parentheses. Now use the substring()
method to extract the two subexpressions that are arguments to that
operator. Recursively build the left and right child subtrees out of
these subexpressions.
This is not as inefficient as it may seem, since creating a substring takes O(1) time. It doesn't require any copying of characters; internally, a substring is represented as a pointer to the original string together with a pair of integers that indicate the extent of the substring.
But it is nonetheless a somewhat time-inefficient strategy: each character will be repeatedly scanned (though not copied), once for every expression it's part of.
If you use this strategy, be especially careful to handle whitespace. Also note that your error messages are likely to look a bit different from the examples given earlier.
This strategy also has to be careful about minus and plus symbols. The
-
symbols in -3.45
and
3.45e-67
are not subtraction operators,
and the +
symbol in "6.324e+05"
is not
an addition operator. Probably the easiest solution is to look at
the character before the minus or plus symbol (ignoring whitespace);
that should let you figure out what kind of minus or plus it is.
You can write a protected method eatExpression
,
which removes the smallest complete expression from the front of
the string and returns a tree for it. eatExpression
can be implemented recursively. If the string is just a number,
you return a 1-node tree. If it has the form (s1
c s2)
, then you do 5 steps: eat a left
parenthesis, recursively eat a subexpression s1
,
eat an operator character c
, recursively eat a
second subexpression s2
, and finally eat the right
parenthesis. You can combine the operator with the two subexpression
trees to get a tree to return.
If any of these steps go wrong, it signals that something is wrong with the input string, so you should throw an exception with a detailed error message. For example, you might be ready to eat an operator character, but there isn't one at the front of the string for you.
The previous two techniques use recursion. Recursion is a very nice way of keeping track of where you are and how to assemble the results. After all, when a method call returns, its caller remembers where to put the result in the tree and how to continue with its own computation.
But you can also solve the problem without recursion. The idea is
to eat the string from left to right, building the tree as you go.
Your "current node" in the tree keeps track of where you are in the
computation. If you eat a number, it goes at the current node. If
you eat a left parenthesis, you make the current external node
internal (by giving it children with createExternal()
),
and descend to the new left child, which becomes the current node. If
you eat an operator, then ... well, you get the idea.
Again, if any of these steps go wrong, it signals a problem and you should throw an exception with a detailed error message. For example, the input string might inappropriately "tell" you to add children to a node that is already internal, or add an element to a node that already has one, or ... what else?
Regardless of which strategy you choose, your error messages should
be designed to be useful to the user. Don't just report something
like can't add children to an internal node
. Tell the
user what is wrong with the string so that he/she knows how to fix
it!
Download the six ViewTree
.class
files to your current directory or CLASSPATH directory. (Or
download the single zipfile containing them.)
As already mentioned, you can use code from the textbook.
Use String.charAt()
to extract a single char from a string, and String.substring()
to get an arbitrary-length substring.
The String.trim()
method is very helpful for eliminating whitespace at the beginning or
end of a string. Or just use the lower-level method Character.isWhitespace()
as a portable way to find out whether a single character is a
whitespace character (typically space, tab, or newline).
You might find the StringBuffer
class useful for efficiently building up a long string, for example
for the prefix
, infix
, and
postfix
methods. (You can get away without it,
though.)
You can convert a string to a double using Double.parseDouble()
.
But two of the strategies in the previous section require you to "eat"
characters and doubles from the front of a string. This means you
must remember your current position in the string (i.e., how much have
you eaten so far?). Here are three options:
You could implement this yourself.
Java provides a family of java.io.Reader
classes. These support a read()
method for eating
successive characters from a string (new StringReader("my input
string")
), a file (new FileReader("filename")
), or
any other stream. A Reader object keeps track of where it is in the
string or file. If you use a PushbackReader
, you can use
unread()
to spit unwanted characters back onto the front
of the string after you eat them and decide they taste wrong for the
current stage of your computation.
Java's StringReader
only knows how to eat
individual characters from the front of a string. But the
scanf
function in C and the >>
operator in C++ know how to eat an entire double
(like
-6.324e+05
) and return its value. These methods
are also able to skip whitespace.
This is a surprising omission in Java. Figuring someone must have
remedied it, I searched the net and eventually found a
DataReader
class described in this
JavaWorld article. I have improved this class slightly for you;
you can download it here, including
brief instructions and a small example illustrating
readDouble()
and peek()
.
(Alternatively, you can get a scanf
workalike for
Java, but this is less in the Java spirit.)
Fully parenthesized expressions make things easier for you, but harder for the user. The point of this extra credit challenge is to deal with strings that are not necessarily fully parenthesized. As before, you want to convert back and forth between strings and expression trees.
Rename your old TreeExpression
class (and its file) to
FullParenTreeExpression
. Now write a new
TreeExpression
class that extends
FullParenTreeExpression
. The new class should behave
differently in two ways:
It should override its parent's infix()
method.
The new infix()
method should convert a tree to a string
that has as few parentheses as possible. For example, (new
TreeExpression("((5+3)*4)")).infix()
should return the string
"(5+3)*4
, and (new
TreeExpression("((5*3)+4)")).infix()
should return just
5*3+4
, since that has the same meaning, given the usual
rules about operator precedence ("order of operations").
We discussed a partial solution to this in lecture. It's not too
hard. However, be careful about the minus and divide operators.
((5-4)-3)
should come out as 5-4-3
, but
(5-(4-3))
shouldn't!
Its constructor should be able to create a tree from expression strings even if they are not fully parenthesized. Again, you must follow the usual rules about operator precedence. To find out how to do this, search the web to find a good discussion of recursive descent. Some discussions specifically talk about parsing arithmetic expressions, which is an easy case.
You can test your improved TreeExpression
class with
ViewTree
as before. If you only handle one of the two
bullet points above, you will still get extra credit; more if you
handle both.
Jason Eisner - jason@cs.jhu.edu
- $Date: 2004/03/08 07:18:31 $