Chapter 4. A Menu Generation Language

The previous chapter provided a simple example of an interpreter, a desktop calculator. In this chapter, we turn our attention to compiler design by developing a menu generation language (MGL) and its associated compiler. We begin with a description of the language that we are going to create. Then we look at several iterations of developing the lex and yacc specifications. Lastly, we create the actions associated with our grammar and which implement the features of the MGL.

Overview of the MGL

We’ll develop a language that can be used to generate custom menu interfaces. It will take as input a description file and produce a C program that can be compiled to create this output on a user’s terminal, using the standard curses library to draw the menus on the screen.[11]

In many cases when an application requires a lot of tedious and repetitive code, it is faster and easier to design a special purpose language and write a little compiler that translates your language into C or something else your computer can already handle. Curses programming is tedious, because you have to position all of the data on the screen yourself. MGL automates most of the layout, greatly easing the job.

The menu description consists of the following:

  1. A name for the menu screen

  2. A title or titles

  3. A list of menu items, each consisting of:

    item
    [ command ]
    action
    [ attribute ]

    where item is the text string that appears on the menu, command is the mnemonic used to provide command-line access to the functions of the menu system, action is the procedure that should be performed when a menu item is chosen, and attribute indicates how this item should be handled. The bracketed items are optional.

  4. A terminator

Since a useful application usually has several menus, a description file can contain several different named menus.

A sample menu description file is:

    screen myMenu
    title "My First Menu"
    title "by Tony Mason"

    item "List Things to Do"
    command "to-do"
    action execute list-things-todo
    attribute command

    item "Quit"
    command "quit"
    action quit

    end myMenu

The MGL compiler reads this description and produces C code, which must itself be compiled. When the resulting program is executed, it creates the following menu:

        My First Menu
        by Tony Mason

    1) List Things to Do
    2) Quit

When the user presses the “1” key or enters the command “to-do”, the procedure “list-things-todo” is executed.

A more general description of this format is:

            screen <name>
    title <string>

    item <string>
    [ command <string> ]
    action {execute | menu | quit | ignore} [<name>]
    [attribute {visible | invisible}]

    end <name>

As we develop this language, we will start with a subset of this functionality and add features to it until we implement the full specification. This approach shows you how easy it is to modify the lex-generated lexer and the yacc-generated parser as we change the language.

Developing the MGL

Let’s look at the design process that led to the grammar above. Menus provide a simple, clean interface for inexperienced users. For these users, the rigidity and ease of use provided by a menu system is ideal.

A major disadvantage of menus is that they keep experienced users from moving directly into the desired application. For these people, a command-driven interface is more desirable. However, all but the most experienced users occasionally want to fall back into the menu to access some seldom used function.

Our MGL should be designed with both of these design goals in mind. Initially, suppose we start with the keyword command, which indicates a menu choice or command that a user can issue to access some function.

This hardly constitutes a usable language. Nevertheless, we can sketch out a lexical specification for it:

    ws   [ \t]+
    nl   \n
    %%
    {ws} ;
    command   { return COMMAND; }
    {nl} { lineno++; }
    .    { return yytext[0];}

and its corresponding yacc grammar:

    %{
    #include <stdio.h>
    %}


    %token COMMAND
    %%
    start:     COMMAND
          ;

Our lexer merely looks for our keyword and returns the appropriate token when it recognizes one. If the parser sees the token COMMAND, then the start rule will be matched, and yyparse() will return successfully.

Each item on a menu should have an action associated with it. We can introduce the keyword action. One action might be to ignore the item (for unimplemented or unavailable commands), and another might be to execute a program; we can add the keywords ignore and execute.

Thus, a sample item using our modified vocabulary might be:

    command action execute

We must tell it what to execute, so we add our first noncommand argument, a string. Because program names can contain punctuation, we will presume that a program name is a quoted string. Now our sample item becomes:

    choice action execute "/bin/sh"

Example 4-1 demonstrates that we can modify our lex specification to support the new keywords, as well as the new token type.

Example 4-1. First version of MGL lexer

ws        [ \t]+
qstring   \"[^\"\n]*[\"\n]
nl        \n
%%
{ws}      ;
{qstring} { yylval.string = strdup(yytext+1); /* skip open quote */
            if(yylval.string[yyleng-2] != '"')
              warning("Unterminated character string",(char *)0);
          else
              yylval.string[yyleng-2] = '\0'; /* remove close quote */
            return QSTRING;
          }
action      { return ACTION; }
execute     { return EXECUTE; }

command     { return COMMAND;  }
ignore      { return IGNORE; }
{nl}        { lineno++; }
.            { return yytext[0]; }

Our complex definition of a qstring is necessary to prevent lex from matching a line like:

    "apples" and "oranges"

as a single token. The "[^\"\n]*” part of the pattern says to match against every character that is not a quotation mark or a newline. We do not want to match beyond a newline because a missing closing quotation mark could cause the lexical analyzer to plow through the rest of the file, which is certainly not what the user wants, and may overflow internal lex buffers and make the program fail. With this method, we can report the error condition to the user in a more polite manner. When we copy the string, we remove the opening and closing quotes, because it’s easier to handle the string without quotes in the rest of the code.

We also need to modify our yacc grammar (Example 4-2).

Example 4-2. First version of MGL parser

%{
#include <stdio.h>
%}

%union {
        char *string;       /* string buffer */
}

%token COMMAND ACTION IGNORE EXECUTE
%token <string> QSTRING
%%
start:    COMMAND action
      ;

action:   ACTION IGNORE
        | ACTION EXECUTE QSTRING
        ;
%%

We defined a %union including the “string” type and the new token QSTRING, which represents a quoted string.

We need to group information for a single command and choice combination together as a menu item. We introduce each new item as an item, using the keyword item. If there is an associated command, we indicate that using the keyword command. We add the new keyword to the lexer:

     . . .
    %%
     . . .
    item            { return ITEM; }

Although we have changed the fundamental structure of the language, there is little change in the lexical analyzer. The change shows up in the yacc grammar, as shown in Example 4-3.

Example 4-3. Grammar with items and actions

%{
#include <stdio.h>
%}


%union {
        char *string;        /* string pointer */
}

%token COMMAND ACTION IGNORE EXECUTE ITEM
%token <string> QSTRING
%%
item:   ITEM command action
      ;

command:    /* empty */
          | COMMAND
          ;

action:     ACTION IGNORE
          | ACTION EXECUTE QSTRING
          ;
%%

Since each menu item need not have a corresponding command, one of the command rules has an empty right-hand side. Surprisingly, yacc has no trouble with such rules, so long as the rest of the grammar makes it possible to tell unambiguously that the optional elements are not there.

We still have not given any meaning to the keyword command. Indeed, it is often a good idea to try writing the yacc grammar alone, because it can indicate “holes” in the language design. Fortunately, this is quickly remedied. We will restrict commands to strings of alphanumeric characters. We add an ID for “identifier” token to our lexical analyzer:

     . . .
    id        [a-zA-Z][a-zA-Z0-9]*
    %%
     . . .
    {id}      { yylval.string = strdup(yytext);
                return ID;
              }

The value of an ID is a pointer to the name of the identifier. In general, it is a poor idea to pass back a pointer to yytext as the value of a symbol, because as soon as the lexer reads the next token it may overwrite yytext. (Failure to copy yytext is a common lexer bug, often producing strange symptoms as strings and identifiers mysteriously seem to change their names.) We copy the token using strdup() and pass back a pointer to the copy. The rules that use an ID must be careful to free the copy when they are done with it.

In Example 4-4 we add the ID token to our yacc grammar.

Example 4-4. Grammar with command identifiers

%{
#include <stdio.h>
%}

%union {
       char *string;         /* string buffer */
}

%token COMMAND ACTION IGNORE EXECUTE ITEM
%token <string> QSTRING ID
%%
item:    ITEM command action
         ;

command:  /* empty */
        | COMMAND ID
        ;


action:   ACTION IGNORE
        | ACTION EXECUTE QSTRING
        ;
%%

The grammar does not provide for more than a single line in a menu. We add some rules for items that support multiple items:

     . . .
    %%
    items:  /* empty */
          | items item
          ;

    item:   ITEM command action
           ;
     . . .

Unlike all our previous rules, these rely upon recursion. Because yacc prefers left-recursive grammars, we wrote “items item” rather than the right-recursive version “item items.” (See the section "Recursive Rules" in Chapter 7 for why left-recursion is better.)

One of the rules for items has an empty right-hand side. For any recursive rule, there must be a terminating condition, a rule that matches the same non-terminal non-recursively. If a rule is purely recursive with no non-recursive alternative, most versions of yacc will halt with a fatal error since it is impossible to construct a valid parser under those circumstances. We will use left-recursion many times in many grammars.

In addition to being able to specify items within the menu, you may want to have a title at the top of the menu. Here is a grammar rule that describes a title:

    title:       TITLE       QSTRING
            ;

The keyword title introduces a title. We require that the title be a quoted string. We add the new token TITLE to our lex specification:

     . . .
    %%
    title        { return TITLE;  }
     . . .

We might want more than a single title line. Our addition to the grammar is:

    titles:   /* empty */
            | titles title
            ;

    title:       TITLE       QSTRING
            ;

A recursive definition allows multiple title lines.

The addition of title lines does imply that we must add a new, higher-level rule to consist of either items or titles. Titles will come before items, so Example 4-5 adds a new rule, start, to our grammar.

Example 4-5. Grammar with titles

%{
#include <stdio.h>
%}

%union {
        char *string;       /* string buffer */
}

%token COMMAND ACTION IGNORE EXECUTE ITEM TITLE
%token <string> QSTRING ID
%%
start: titles items
       ;

titles:  /* empty */
       | titles title
       ;

title:       TITLE       QSTRING
       ;

items:  /* empty */
      | items item
      ;

item:    ITEM command action
       ;

command:  /* empty */
        | COMMAND ID
        ;

action:  ACTION IGNORE
       | ACTION EXECUTE QSTRING
       ;
%%

After we’d used the MGL a little bit, we found that one menu screen wasn’t enough. We wanted multiple screens and the ability to refer to one screen from another to allow multi-level menus.

We defined a new rule screen, to contains a complete menu with both titles and items. To add the handling of multiple screens, we can, once again, use a recursive rule to build a new screens rule. We wanted to allow for empty screens, so we added a total of five new sets of rules:

    screens:  /* empty */
            | screens screen
            ;

    screen:  screen_name screen_contents screen_terminator
           | screen_name screen_terminator
           ;

    screen_name:  SCREEN ID
                | SCREEN
                ;

    screen_terminator:  END ID
                      | END
                      ;

    screen_contents: titles lines

We provide each screen with a unique name. When we wish to reference a particular menu screen, say, “first,” we can use a line such as:

    item "first" command first action menu first

When we name screens, we must also indicate when a screen ends, so we need a screen_terminator rule. Thus, a sample screen specification might be something like this:

    screen main
    title "Main screen"
    item "fruits" command fruits action menu fruits
    item "grains" command grains action menu grains
    item "quit" command quit action quit
    end main


    screen fruits
    title "Fruits"
    item "grape" command grape action execute "/fruit/grape"
    item "melon" command melon action execute "/fruit/melon"
    item "main" command main action menu main
    end fruits


    screen grains
    title "Grains"
    item "wheat" command wheat action execute "/grain/wheat"
    item "barley" command barley action execute "/grain/barley"
    item "main" command main action menu main
    end grains

Our rule does provide for the case when no name is given; hence, the two cases for screen_name and screen_terminator. When we actually write actions for the specific rules, we will check that the names are consistent, to detect inconsistent editing of a menu description buffer, for instance.

After some thought, we decided to add one more feature to our menu generation language. For each item, we wish to indicate if it is visible or invisible. Since of this is an attribute of the individual item, we precede the choice of visible or invisible with the new keyword attribute. Here is the portion of our new grammar that describes an attribute:

    attribute: /* empty */
             | ATTRIBUTE VISIBLE
             | ATTRIBUTE INVISIBLE
             ;

We allow the attribute field to be empty to accept a default, probably visible. Example 4-6 is our workable grammar.

Example 4-6. Complete MGL grammar

screens:   /* empty */
         | screens screen
         ;

screen:    screen_name screen_contents screen_terminator
         | screen_name screen_terminator
         ;

screen_name:  SCREEN ID
            | SCREEN
            ;

screen_terminator:   END ID
                   | END
                   ;

screen_contents: titles lines
                 ;

titles:   /* empty */
        | titles title
        ;

title: TITLE QSTRING
          ;

lines: line
       | lines line
       ;
line:  ITEM QSTRING command ACTION action attribute
       ;


command:   /* empty */
         | COMMAND ID
         ;

action:    EXECUTE QSTRING
         | MENU ID
         | QUIT
         | IGNORE
         ;

attribute:   /* empty */
           | ATTRIBUTE VISIBLE
           | ATTRIBUTE INVISIBLE
           ;

We have replaced the start rule of previous examples with the screens rule as the top-level rule. If there is no %start line in the declarations section, it simply uses the first rule.

Building the MGL

Now that we have a basic grammar, the work of building the compiler begins. First, we must finish modifying our lexical analyzer to cope with the new keywords we introduced in our last round of grammar changes. Our modified lex specification is shown in Example 4-7.

Example 4-7. MGL lex specification

ws        [ \t]+
comment   #.*
qstring   \"[^\"\n]*[\"\n]
id        [a-zA-Z][a-zA-Z0-9]*
nl        \n

%%

{ws}      ;
{comment} ;
{qstring} { yylval.string = strdup(yytext+1);
            if (yylval.string[yyleng-2] != '"')
              warning("Unterminated character string",(char *)0);
          else
               yylval.string[yyleng-2] = '\0'; /* remove close quote */
            return QSTRING;
          }
screen    { return SCREEN; }
title     { return TITLE; }
item      { return ITEM; }
command   { return COMMAND; }
action    { return ACTION; }
execute   { return EXECUTE; }
menu      { return MENU; }
quit      { return QUIT; }
ignore    { return IGNORE; }
attribute { return ATTRIBUTE; }
visible   { return VISIBLE; }
invisible { return INVISIBLE; }
end       { return END; }
{id}      { yylval.string = strdup(yytext);
            return ID;
          }
{n1}      { lineno++; }
.         { return yytext[0]; }
%%

An alternative way to handle keywords is demonstrated in Example 4-8.

Example 4-8. Alternative lex specification

 . . .
id            [a-zA-Z][a-zA-Z0-9]*
%%
 . . .
{id}      { if (yylval.cmd = keyword(yytext)) returnyylval.cmd;
                yylval.string = strdup(yytext);
                return ID;
               }
%%
/*
 * keyword: Take a text string and determine if it is, in fact,
 * a valid keyword.  If it is, return the value of the keyword;
 * if not, return zero.  N.B.: The token values must be nonzero.
 */

static struct keyword {
char *name;    /* text string */
int value;     /* token */
} keywords[] =
{
"screen",          SCREEN,
"title",           TITLE,
"item",            ITEM,
"command",         COMMAND,
"action",          ACTION,
"execute",         EXECUTE,
"menu",            MENU,
"quit",            QUIT,
"ignore"           IGNORE,
"attribute",       ATTRIBUTE,
"visible".         VISIBLE,
"invisible",       INVISIBLE,
"end".             END,
NULL, 0,
};

int keyword(string)
   char *string;
  {
   struct keyword *ptr = keywords;

   while(ptr->name != NULL)
      if(strcmp(ptr->name,string) == 0)
        {
  return ptr->value;
}
   else
       ptr++;
   return 0; /* no match */
}

The alternate implementation in Example 4-8 uses a static table to identify keywords. The alternative version is invariably slower, since a lex lexer’s speed is independent of the number or complexity of the patterns. We chose to include it here simply because it demonstrates a useful technique we could use if we wished to make the language’s vocabulary extensible. In that case, we would use a single lookup mechanism for all keywords, and add new keywords to the table as needed.

Logically, we can divide the work in processing an MGL specification file into several parts:

Initialization

Initialize all internal data tables, emit any preambles needed for the generated code.

Start-of-screen processing

Set up a new screen table entry, add the name of the screen to the name list, and emit the initial screen code.

Screen processing

As we encounter each individual item, we deal with it; when we see title lines, we add them to the title list, and when we see new menu items, we add them to the item list.

End-of-screen processing

When we see the end statement, we process the data structures we have built while reading the screen description and emit code for the screen.

Termination

We “clean up” the internal state, emit any final code, and assure that this termination is OK; if there is a problem, we report it to the user.

Initialization

A certain amount of work must be performed when any compiler begins operation. For instance, internal data structures must be initialized; recall that Example 4-8 uses a keyword lookup scheme rather than the hardcoded keyword recognition scheme used earlier in Example 4-7. In a more complex application with a symbol table as part of initialization we would insert the keywords into the symbol table as we did in Example 3-10.

Our main() routine starts out simply:

    main()
    {
        yyparse();
    }

We must also be able to invoke our compiler by giving it a filename. Because lex reads from yyin and writes to yyout, which are assigned to the standard input and standard output by default, we can reattach the input and output files to obtain the appropriate action. To change the input or output, we open the desired files using fopen() from the standard I/O library and assign the result to yyin or yyout.

If the user invokes the program with no arguments, we write out to a default file, screen.out, and read from the standard input, stdin. If the user invokes the program with one argument, we still write to screen.out and use the named file as the input. If the user invokes the program with two arguments, we use the first arguemnt as the input file and the second argument as the output file.

After we return from yyparse(), we perform post-processing and then check to assure we are terminating with no error condition. We then clean up and exit.

Example 4-9 shows our resulting main() routine.

Example 4-9. MGL main() routine

char *progname = "mgl";
int lineno = 1;

#define DEFAULT_OUTFILE "screen.out"

char *usage = "%s: usage [infile] [outfile]\n";

main(int argc, char **argv)
{

      char *outfile;
      char *infile;
      extern FILE *yyin, *yyout;

      progname = argv[0];

      if(argc > 3)
      {
            fprintf(stderr,usage, progname);
            exit(1);
      }
      if(argc > 1)
      {
            infile = argv[1];
            /* open for read */
            yyin = fopen(infile,"r");
            if(yyin == NULL) /* open failed */
            {
                  fprintf(stderr,"%s: cannot open %s\n",
                        progname, infile);
                  exit(1);
             }
      }

      if(argc > 2)
      {
            outfile = argv[2];
      }
      else
      {
                  outfile = DEFAULT_OUTFILE;
      }

      yyout = fopen(outfile,"w");
      if(yyout == NULL) /* open failed */
      {
                  fprintf(stderr,"%s: cannot open %s\n",
                  progname, outfile);
            exit(1);
      }
      /* normal interaction on yyin and
         yyout from now on */

      yyparse();

      end_file(); /* write out any final information */

      /* now check EOF condition */
      if(!screen_done) /* in the middle of a screen */
      {
            warning("Premature EOF",(char *)0);
            unlink(outfile); /* remove bad file */
            exit(1);
      }
      exit(0); /* no error */
}

warning(char *s, char *t) /* print warning message */
{
      fprintf(stderr, "%s: %s", progname, s);
      if (t)
            fprintf(stderr, " %s", t);
      fprintf(stderr, " line %d\n", lineno);
}

Screen Processing

Once we have initialized the compiler and opened the files, we turn our attention to the real work of the menu generator—processing the menu descriptions. Our first rule, screens, requires no actions. Our screen rule decomposes into the parts screen_name, screen_contents, and screen_terminator. screen_name interests us first:

    screen_name:  SCREEN ID
                | SCREEN
                ;

We insert the specified name into our list of names duplicates; in case no name is specified, we will use the name “default.” Our rule becomes:

    screen_name:  SCREEN ID  { start_screen($2); }
                | SCREEN     { start_screen(strdup("default")); }
                ;

(We need the call to strdup() to be consistent with the first rule which passes a string dynamically allocated by the lexer.) The start_screen routine enters the name into our list of screens and begin to generate the code.

For instance, if our input file said “screen first”, the start_screen routine would produce the following code:

    /* screen first */
    menu_first()
    {
            extern struct item menu_first_items[];

            if(!init) menu_init();

            clear();
            refresh();

When processing our menu specification, the next object is a title line:

    title: TITLE QSTRING
             ;

We call add_title(), which computes the positioning for the title line:

    title: TITLE QSTRING { add_title($2); }
             ;

Our sample output for title lines looks like this:

        move(0,37);
        addstr("First");
        refresh();

We add a title line by positioning the cursor and printing out the given quoted string (with some rudimentary centering as well). This code can be repeated for each title line we encounter; the only change is to the line number used to generate the move() call.

To demonstrate this, we make a menu description with an extra title line:

    screen first
    title "First"
    title "Copyright 1992"

    item "first" command first action ignore
          attribute visible
    item "second" command second action execute "/bin/sh"
          attribute visible
    end first

    screen second
    title "Second"
    item "second" command second action menu first
          attribute visible
    item "first" command first action quit
          attribute invisible
    end second

Our sample output for title lines is:

    move(0,37);
    addstr("First");
    refresh();
    move(1,32);
    addstr("Copyright 1992");
    refresh();

Once we see a list of item lines, we take the individual entries and build an internal table of the associated actions. This continues until we see the statement end first when we perform post-processing and finish building the screen. To build a table of menu items, we add the following actions to the item rule:

    line: ITEM qstring command ACTION action attribute
             {
             item_str = $2;
             add_line($5, $6);
               $$ = ITEM;
              }
             ;

The rules for command, action, and attribute primarily store the token values for later use:

    command: /* empty */ { cmd_str = strdup(""); }
           | COMMAND id { cmd_str = $2; }
            ;

A command can be null or a specific command. In the first case we save a null string for the command name (using strdup() to be consistent with the next rule), and in the second we save the identifier given for the command in cmd_str.

The action rules and the associated actions are more complex, partially because of the larger number of variations possible:

    action: EXECUTE qstring
             { act_str = $2;
             $$ = EXECUTE;
             }
          | MENU id
             { /* make "menu_" $2 */
             act_str = malloc(strlen($2) + 6);
             strcpy(act_str,"menu_");
               strcat(act_str, $2);
             free($2);
             $$ = MENU;
            }
           | QUIT   { $$ = QUIT; }
           | IGNORE { $$ = IGNORE; }
           ;

Finally, the attribute rule is simpler, as the only semantic value is represented by the token:

    attribute:  /* empty */             { $$ = VISIBLE; }
              | ATTRIBUTE VISIBLE       { $$ = VISIBLE; }
              | ATTRIBUTE INVISIBLE    { $$ = INVISIBLE; }
              ;

The return values of the action rule and the attribute rule were passed to the add_line routine; this call takes the contents of the various static string pointers, plus the two return values, and creates an entry in the internal state table.

Upon seeing the end first statement, we must the final processing of the screen. From our sample output, we finish the menu_first routine:

        menu_runtime(menu_first_items);
    }

The actual menu items are written into the array menu_first_items:

    /* end first */
    struct item menu_first_items [] = {
          {"first","first",271,"",0,273},
          {"second","second",267,"/bin/sh",0,273},
          {(char *)0, (char *)0, 0, (char *)0, 0, 0},
    };

The run-time routine menu_runtime will display the individual items; this will be included in the generated file as part of the code at the end.

Termination

The final stage of dealing with a single screen is to see the termination of that screen. Recall our screen rule:

    screen:   screen_name screen_contents screen_terminator
            | screen_name screen_terminator
            ;

The grammar expects to see the screen_terminator rule:

    screen_terminator:   END ID
                      |  END
                      ;

We add a call to our post-processing routine for end-of-screen post-processing (not the end-of-file post-processing which we will discuss later in this section). The resulting rule is:

    screen_terminator: END id { end_screen($2); }
                     | END { end_screen(strdup("default")); }
                     ;

It calls the routine end_screen with the name of the screen or “default” if no name is provided. This routine validates the screen name. Example 4-10 shows the code that implements it.

Example 4-10. Screen end code

/*
 * end_screen:
 * Finish screen, print out postamble.
 */

end_screen(char *name)
{

      fprintf(yyout, "menu_runtime(menu_%s_items);\n",name);

      if(strcmp(current_screen,name) != 0)
      {
            warning("name mismatch at end of screen",
                current_screen);
      }
      fprintf(yyout, "}\n");
      fprintf(yyout, "/* end %s */\n",current_screen);

      process_items();

      /* write initialization code out to file */
      if(!done_end_init)
      {
            done_end_init = 1;
            dump_data(menu_init);
      }

      current_screen[0]  =  '\0';         /* no current screen */

      screen_done = 1;

      return 0;
}

This routine handles a screen name mismatch as a nonfatal error. Since the mismatch does not cause problems within the compiler, it can be reported to the user without termination.

This routine processes the data generated by our add_item() call by processing individual item entries with process_items(). Then it calls dump_data to write out some initialization routines; these initialization routines are really a static array of strings that are built into the MGL compiler. We call dump_data() in several places to dump different code fragments to the output file. An alternative approach is to simply copy these code fragments from a “skeleton” file containing the boiler-plate code, as some versions of lex and yacc do.

Post-processing occurs after all input has been read and parsed. This is done by the main() routine by calling end_file() after yyparse() has completed successfully. Our implementation is:

    /*
     * this routine writes out the run-time support
     */

    end_file()
    {

          dump_data(menu_runtime);
    }

This routine contains a single call to dump_data() to write the runtime routines, which are stored in a static array in the compiler as the initialization code was. All our routines that handle the boiler-plate code are sufficiently modular in design that these could be rewritten to use skeleton files.

Once this routine has been called, the main routine terminates by checking to determine if the end of input was detected at a valid point, i.e., at a screen boundary, and is not generating an error message.

Sample MGL Code

Now that we have built a simple compiler, let’s demonstrate the basic workings of it. Our implementation of the MGL consists of three parts: the yacc grammar, the lex specification, and our supporting code routines. The final version of our yacc grammar, lex lexer, and supporting code is shown in Appendix I, MGL Compiler Code .

We do not consider the sample code to be robust enough for serious use; our attention was given to developing a first-stage implementation. The resulting compiler will, however, generate a fully functional menu compiler. Here is our sample input file:

    screen first
    title "First"

    item "dummy line" command dummy action ignore
          attribute visible
    item "run shell" command shell action execute "/bin/sh"
          attribute visible

    end first

    screen second

    title "Second"

    item "exit program" command exit action quit
          attribute invisible
    item "other menu" command first action menu first
          attribute visible

    end second

When that description file was processed by our compiler, we got the following output file:

    /*
     * Generated by MGL: Thu Aug 27 18:03:33 1992
     */

    /* initialization information */
    static int init;

    #include <curses.h>
    #include <sys/signal.h>
    #include <ctype.h>
    #include "mglyac.h"

    /* structure used to store menu items */
    struct item {
          char *desc;
          char *cmd;
          int action;
          char  *act_str;    /* execute string */
          int (*act_menu)();  /* call appropriate function */
          int attribute;
    }

    /* screen first */
    menu_first()
    {
          extern struct item menu_first_items[];
          if(!init) menu_init();

          clear();
          refresh();
          move(0,37);
          addstr("First");
          refresh();
          menu_runtime(menu_first_items);
    }
    /* end first */
    struct item menu_first_items [] = {
    {"dummy line","dummy",269,"",0,271},
    {"run shell","shell",265,"/bin/sh",0,271},
    {(char *)0, (char *)0, 0, (char *)0, 0, 0},
    };

    menu_init()
    {
          void menu_cleanup();

          signal(SIGINT, menu_cleanup);
          initscr();
          crmode();
    }


    menu_cleanup()
    {
          mvcur(0, COLS − 1, LINES − 1, 0);
          endwin();
    }

    /* screen second */
    menu_second()
    {
          extern struct item menu_second_items[];

          if(!init) menu_init();

          clear();
          refresh();
          move(0,37);
          addstr("Second");
          refresh();
          menu_runtime(menu_second_items);
    }
    /* end second */
    struct item menu_second_items[ ] = {
    {"exit program","exit",268,"",0,272},
    {"other menu","first",267,"",menu_first,271},
    {(char *)0, (char *)0, 0, (char *)0, 0, 0},
    };

    /* runtime */
    menu_runtime(items)
    struct item *items;
    {
          int visible = 0;
          int choice = 0;
          struct item *ptr;
          char buf[BUFSIZ];

          for(ptr = items; ptr->desc != 0; ptr++) {
                addch('\n'); /* skip a line */
                if(ptr->attribute == VISIBLE) {
                      visible++;
                      printw("\t%d) %s",visible,ptr->desc);
                }
          }

          addstr("\n\n\t"); /* tab out so it looks nice */
          refresh();

          for(;;)
          {
                int i, nval;

                getstr(buf);

                /* numeric choice? */
                nval = atoi(buf);

                /* command choice ? */
                i = 0;
                for(ptr = items; ptr->desc != 0; ptr++) {
                      if(ptr->attribute != VISIBLE)
                            continue;
                      i++;
                      if(nval == i)
                            break;
                      if(!casecmp(buf, ptr->cmd))
                            break;
                }
                if(!ptr->desc)
                      continue;  /* no match */

                switch(ptr->action)
                {
                case QUIT:
                     return 0;
                case IGNORE:
                      refresh();
                      break;
                case EXECUTE:
                      refresh();
                      system(ptr->act_str);
                      break;
                case MENU:
                      refresh();
                      (*ptr->act_menu)();
                      break;
                default:
                      printw("default case,  no action\n");
                      refresh();
                      break;
                }
                refresh();
          }
    }

    casecmp(char *p,  char *q)
    {
          int pc,  qc;

          for(;  *p   !=  0;  p++,   q++)   {
                 pc = tolower(*p);
                 qc = tolower(*q);

                 if(pc  != qc)
                        break;
          }
          return pc-qc;
    }

In turn, we compiled this code, generated by our compiler and written to first.c, with the following command:

    $ cat >> first.c
    main()
    {
    menu_second();
    menu_cleanup();
    }
    ^D
    $ cc -o first first.c -lcurses -ltermcap
    $

We had to add a main() routine to the generated code; in a revision of the MGL, it might be desirable to include a command-line option or a specification option to provide the name of a routine to call from within the main loop; this a typical possible enhancement. Because we wrote our grammar in yacc, this modification would be easy. For example, we might modify the screens rule to read:

    screens:   /* nothing */
             | preamble screens screen
             | screens screen
             ;
    preamble:  START ID
            |  START DEFAULT
            ;

where we add appropriate keywords for START and DEFAULT.

Running our MGL-generated screen code, we see with the following menu screen:

                              Second
    1) other menu

We see the one visible menu entry, and when we enter “1” or “first” move to the first menu:

                              First
    1)   dummy line
    2)   run shell

Exercises

  1. Add a command to identify the main screen and generate a main routine, as outlined previously.

  2. Improve the screen handling: read characters in CBREAK mode rather than a line at a time, allow more flexible command handling, e.g., accept unique prefixes of commands rather than requiring the whole command, allow application code to set and clear the invisible attribute, etc.

  3. Extend the screen definition language to let titles and command names come from variables in the host program, e.g.:

        screen sample
        title $titlevar
    
        item $label1 command $cmd1 action ignore
            attribute visible
    
        end sample

    where titlevar and label1 are character arrays or pointers in the host program.

  4. (Term project) Design a language to specify pulldown or pop-up menus. Implement several translators based on the same parser so you can use the same menu specification to create menus that run in different environments, e.g., a terminal with curses, Motif, and Open Look.

  5. Yacc is often used to implement “little languages” that are specific to an application area and translate into lower-level, more general languages. The MGL turns a menu specification into C, eqn turns an equation language into raw troff. What are some other application areas that would benefit from a little language? Design and implement a few of them with lex and yacc.



[11] For more information on curses, see Programming with Curses by John Strang, published by O’Reilly & Associates.

Get lex & yacc, 2nd Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.