|
|

Available Now

Search

Contact
Feedback:
Send an e-mail with your
comments about this program (or anything else).

Help support DFF
If you benefit from the website, in terms of
knowledge, entertainment value, or something otherwise useful,
consider making a donation via PayPal to help defray the
costs. (No PayPal account necessary to donate via credit
card.) Transaction is secure.


|
| |
Problem Description
Recognizing strings based on a set of restrictions is a common computational problem.
A Slurpy is a string of characters that has certain properties. Your program will read in strings of characters and output
whether or not they are Slurpies.
A Slump is a character string that has the following properties:
-
Its first character is either a 'D' or an 'E'.
-
The first character is followed by a string of one or more 'F's.
-
The string of one or more 'F's is followed by either a Slump or a 'G'.
The Slump or 'G' that follows the F's ends the Slump. For example DFFEFFFG is a Slump since it has a 'D' for its first character,
followed by a string of two F's, and ended by the Slump 'EFFFG'.
-
Nothing else is a Slump.
A Slimp is a character string that has the following properties:
-
Its first character is an 'A'.
-
If it is a two character Slimp then its second and last character is an 'H'.
If it is not a two character Slimp then it is in one of these two forms:
-
'A' followed by 'B' followed by a Slimp followed by a 'C'.
-
'A' followed by a Slump (see above) followed by a 'C'.
-
Nothing else is a Slimp.
A Slurpy is a character string that consists of a Slimp followed by a
Slump.
Background & Techniques
This problem is from the 1996 Mid-Atlantic Regional ACM
Programming Contest. 56 college teams tried it and 35 solved is successfully.
Are you ready to join a team?
My first thought in selecting this problem
was that it would make a good filler program while I worked on something more
challenging. And it is a pretty good example of a
recursive definition, Slumps are defined in
terms of letters and Slumps. Slimps are made from letters, Slimps
and Slumps.
Then, while thinking about a more concise way to define the
problem, I spent a pleasant afternoon rediscovering Extended Backus-Naur Form
(EBNF) method of specifying the syntax (the elements and rules of a
language). Delphi, Pascal, any other
programming language as well as the "Slurp" language of our current
problem can be defined using EBNF.
It's pretty darn simple to be so powerful. Any EBNF language
definition consists of
Terminal symbols (sometimes displayed in bold faced
type or enclosed in double quotes). These are the "atoms" of the
language.
-
Non-terminal symbols,
frequently enclosed in angled brackets (< >) which are formed
from concatenations terminal symbols and other non-terminal
symbols
-
Rules which define the non-terminal
symbols. All of the non-terminal symbols are
defined by occurring on left side of a rule and are usually considered the name
of the rule.
-
Meta-symbols used in the rules defining the non-terminal
symbols. The
meta prefix means "about" so meta-symbols are those used
in talking about the language but are not part of the language. The
common meta-symbols in EBNF
are:
-
< > (brackets surrounding
a non-terminal name).
-
::= (is equivalent to,
"means").
-
|
(definition alternatives, "or").
-
[ ] (0 or 1
occurrences of enclosed symbols).
-
{ } (0 or more
occurrences of enclosed symbols). The definition of curly brackets
varies according to the implementer. Some references define curly brackets to mean one or
more occurrences. The ISO standard says that one or more occurrences
should be represented as { }-, a minus sign after the brackets.
. ; (rule
terminators - some way to indicate the end of a rule, commonly dot or
semicolon. I have also seen <EOL> used).
" (double
quotes, sometimes used around single character terminal symbols which might
be confused with meta-symbols).
Because of the variations, every document
using EBNF to define a language will usually define the conventions it is using. Mine will be standard except that
I'll put the terminal symbols in bold type to distinguish them
from non-terminal and meta-symbols. So our problem can be defined with these
three rules:
<Slump> ::= DF{F}G
| EF{F}G | DF{F}<Slump> | EF{F}<Slump>;
<Slimp> ::= AH | AB<Slimp>C
| A<Slump>C;
<Slurpy> ::= <Slimp><Slump>;
In case you're interested, here's a link
to the
final draft of the
ISO EBNF Standards document. I know little about the International
Standard Organization, except that they don't make their standards freely
available via the Web and they must use very expensive paper since they want 62
Swiss Francs ($40!) for the 12 page EBNF Standards document! Sounds
like a rip-off to me -- maybe it's not surprising that the standard is usually ignored.
Rant aside, EBNF is useful for compiler
writers since they provide straightforward language definitions - in fact there
are "compiler compilers" that can read any EBNF description set
and produce a compiler program based the the rules. YACC (Yet
Another Compiler Compiler) is the prototype for this and is widely distributed
in the Unix/Linux world. (Windows versions exist).
Running/Exploring the Program
Suggestions for Further Explorations
 |
Compiler development is an interesting area of
study. Compilers read source code and generate output programs that
run independently of the compiler. Interpreters perform the same function
except they execute the resulting code immediately. Maybe it would
be interesting to develop a mini-language and implement a Delphi
interpreter. |
 |
http://compilers.iecc.com/
is the home page of an active compilers newsgroup. Lots of
good browsing here including this link to an online
book entitled "Let's
Build a Compiler" by Dr. Jack Crenshaw. |
 |
Wonder if there exists a Delphi YACC?
We could start with a Slurp compiler that would read the rules and identify
arbitrary strings. |
Modified: July 21, 2006
|