This algorithm corresponds exactly to the one you saw on the lecture slides, but this presentation gives a somewhat different perspective along with some further discussion.
In Homework 1, you'll get to convert some formulas to CNF by hand. (Or if you want, you could implement the algorithm below and implement an automatic converter!)
In lecture, we allowed 7 kinds of propositional formulas in our little language. It might be helpful to think about how you would implement formulas in a language like Java. You'd want a recursive datatype similar to trees, roughly like this:
abstract class Formula { } class Variable extends Formula { String varname; } class AndFormula extends Formula { Formula p; Formula q; } // conjunction class OrFormula extends Formula { Formula p; Formula q; } // disjunction class NotFormula extends Formula { Formula p; } // negation class ImpliesFormula extends Formula { Formula p; Formula q; } // if-then class EquivFormula extends Formula { Formula p; Formula q; } class XorFormula extends Formula { Formula p; Formula q; }
Thus, if φ is an instance of Formula
, we can ask whether it is an
instanceof the subclass AndFormula
. If so, we can look at its two
conjuncts, P
and Q
, which are themselves instances of Formula
.
A CNF formula is a restricted special case. It is a conjunction of "clauses," each of which is a disjunction of "literals," each of which is either a variable or a negated variable.
Now here's a routine to convert any formula to CNF.
CONVERT(φ): // returns a CNF formula equivalent to φ // Any syntactically valid propositional formula φ must fall into // exactly one of the following 7 cases (that is, it is an instanceof // one of the 7 subclasses of Formula). If φ is a variable, then: return φ. // this is a CNF formula consisting of 1 clause that contains 1 literal If φ has the form P ^ Q, then: CONVERT(P) must have the form P1 ^ P2 ^ ... ^ Pm, and CONVERT(Q) must have the form Q1 ^ Q2 ^ ... ^ Qn, where all the Pi and Qi are disjunctions of literals. So return P1 ^ P2 ^ ... ^ Pm ^ Q1 ^ Q2 ^ ... ^ Qn. If φ has the form P v Q, then: CONVERT(P) must have the form P1 ^ P2 ^ ... ^ Pm, and CONVERT(Q) must have the form Q1 ^ Q2 ^ ... ^ Qn, where all the Pi and Qi are dijunctions of literals. So we need a CNF formula equivalent to (P1 ^ P2 ^ ... ^ Pm) v (Q1 ^ Q2 ^ ... ^ Qn). So return (P1 v Q1) ^ (P1 v Q2) ^ ... ^ (P1 v Qn) ^ (P2 v Q1) ^ (P2 v Q2) ^ ... ^ (P2 v Qn) ... ^ (Pm v Q1) ^ (Pm v Q2) ^ ... ^ (Pm v Qn) If φ has the form ~(...), then: If φ has the form ~A for some variable A, then return φ. If φ has the form ~(~P), then return CONVERT(P). // double negation If φ has the form ~(P ^ Q), then return CONVERT(~P v ~Q). // de Morgan's Law If φ has the form ~(P v Q), then return CONVERT(~P ^ ~Q). // de Morgan's Law If φ has the form P -> Q, then: Return CONVERT(~P v Q). // equivalent If φ has the form P <-> Q, then: Return CONVERT((P ^ Q) v (~P ^ ~Q)). If φ has the form P xor Q, then: Return CONVERT((P ^ ~Q) v (~P ^ Q)).
See why this recursive algorithm works? It must return a correct answer, assuming it doesn't recurse forever.
(You could prove that it won't recurse forever. The idea is to use induction on the depth of the formula φ, measuring depth in a particular way.)
Unfortunately, the case of P v Q
above
is very bad. If CONVERT(P) v CONVERT(Q)
is a formula of size O(m+n), the result of
CONVERT(P v Q)
can be a formula of size O(mn), which
is in general much worse.
Moreover, m and n may already be large, since P and Q may
themselves contain v. Unfortunately this means that CONVERT(φ)
may be exponentially larger than φ: consider the case where
φ = (P1 ^ P2) v (Q1 ^ Q2) v (R1 ^ R2) v ...
Thus, for the P v Q
case in the above pseudocode,
it is preferable to use the above naive method only if m=1 (or n=1).
CONVERT((Z ->
P) ^ (~Z -> Q))
. This formula is satisfiable if and only if
φ is satisfiable, since any assignment must set Z to be either
true or false, and then the formula accordingly requires the
assignment to satisfy either P or Q, respectively.
This recursive call to CONVERT
will end up calling CONVERT(~Z v P)
and
CONVERT(Z v Q)
, each of which will trigger only the m=1
case since ~Z and Z are literals.
As a result, if φ starts out with only v, ^, ~,
then CONVERT(φ)
is at worst quadratically
larger than φ (see lecture slides for an intuition as to why, and
try to prove it!). So even though this alternative technique does
result in a formula with additional variables, it keeps the total
formula length far smaller.
Note: We still have problems with P <-> Q
and P xor
Q
, since our pseudocode replaces these with expressions that
are more than twice as long. That can lead to exponential blowup if
P and Q themselves contain <->
and xor
.
You can avoid this problem by using the Tseitin transformation
(see below).
The pseudocode above was written as if CONVERT
returns a Formula
instance that just happens to be in CNF.
But it would be more efficient for CONVERT
to return an instance of
the following CNFFormula
class, which can only represent CNF formulas
but does so more compactly, taking advantage of their restricted structure:
class CNFFormula { Vectorclauses; } class Clause { Vector literals; } class Literal { String varname; boolean is_negated; }
Then a code fragment like this
If φ has the form P ^ Q, then: CONVERT(P) must have the form P1 ^ P2 ^ ... ^ Pm, and CONVERT(Q) must have the form Q1 ^ Q2 ^ ... ^ Qn, where all the Pi and Qi are disjunctions of literals. So return P1 ^ P2 ^ ... ^ Pm ^ Q1 ^ Q2 ^ ... ^ Qn.
would come out looking like this:
If φ has the form P ^ Q, then: return new CNFFormula(concatenate(CONVERT(P).clauses, CONVERT(Q).clauses))
assuming that CNFFormula
has a constructor that takes
a list of clauses.
Here is another way of thinking about this whole thing.
Forget about Formula
and just do everything with CNFFormula
.
To convert φ = (X ^ Y) v ~Z
to CNF, we would just construct
φ as a CNFFormula
, using CNFFormula
constructors as follows:
φ = CNFFormula(or, CNFFormula(and, CNFFormula("X"), CNFFormula("Y")) CNFFormula(not, CNFFormula("Z")))
Every intermediate step returns a formula that has already been converted to CNF. So the user never sees a non-CNF formula at all, except for the non-CNF formula that is implicit in the nested constructor calls above.
The CNFFormula
constructors have to handle the same 7 cases as are in
the pseudocode above, and they do it using exactly the same methods.
CNFFormula(String varname) { make a 1-clause formula with 1 literal } CNFFormula(Operator op, CNFFormula P, CNFFormula Q) { if op is the "and" operator, then: return the concatenation of P and Q (each of which is a vector of clauses) if op is the "or" operator, then: ...
and so on, going through the 7 cases from above.
Note that when CNFFormula(and,P,Q)
combines P and Q, it doesn't have
to recursively convert them to CNFFormula
s as in the previous
approach, because in this approach they are already CNFFormula
s,
having been converted already at the time they were built.
If you take our CONVERT
routine above, and replace
each "and" operator with "or" and vice-versa, you wil get a routine
for converting to DNF instead of to CNF. Now, as remarked in lecture,
it's easy to write a SAT solver as
SAT_SOLVER(φ) = DNF_SAT_SOLVER(CONVERT_TO_DNF(φ))
since it's trivial to write a
linear-time DNF_SAT_SOLVER
. So the slow part of this
method must be CONVERT_TO_DNF(φ)
. And indeed,
our CONVERT_TO_DNF
routine may take exponential time.
This isn't surprising because many people suspect that there cannot
be a SAT solver whose worst-case runtime grows less than
exponentially as a function of |φ|. That's the "Satisfiability
Hypothesis," or SH for short.
In fact, CONVERT_TO_DNF
may take exponential time
simply to print its output! An example
where CONVERT_TO_DNF(φ)
blows up in size is
φ = (P1 ^ P2) v (Q1 ^ Q2) v (R1 ^ R2) v ...
for the same reason that the dual version of that formula blows up in the naive
version of CONVERT_TO_CNF(phi)
:
φ = (P1 v P2) ^ (Q1 v Q2) ^ (R1 ^ R2) v ...
In the CNF case, we were able to find a smarter version that keeps this formula compact by using switching variables. But we saw in lecture why the switching variable trick doesn't work for DNF (it has to do with the asymmetry between SAT and TAUT).
The lecture slides conclude with a different scheme called the Tseitin Transformation (Tseitin, 1968). This is much more efficient in the worst case (although it may give longer formulas in many common cases).
The basic idea is to associate a new variable with each subformula, and add 3-CNF clauses that ensure that each new variable is true iff its subformula is true. Finally, to require the whole formula to be true, add the unit clause X, where X is the variable associated with the whole formula.
This scheme ensures that the length of the final 3-CNF formula is proportional to the number of distinct subformulas in the original formula. There is no quadratic or exponential blowup. Furthermore, repeated subformulas do not require additional clauses.
You can find a textual description at Wikipedia.
http://cs.jhu.edu/~jason/tutorials/convert-to-CNF
Jason Eisner - jason@cs.jhu.edu (suggestions welcome) | Last Mod $Date: 2014/02/14 13:42:34 $ |