Skip to content

Bnf Ebnf Homework Help

Pascal EBNF Definition

This document is an Extended Backus-Naur Form definition of the Pascal programming language.

Notation: Square brackets delimit optional constructs; braces indicate zero or more repetitions of the enclosed construct; parantheses indicate simple grouping of constructs; a vertical bar indicates choice of one from many; literal text in definitions is denoted using bold font or double quotation marks .

Credits: Copyright Stephen Fitzpatrick, Department of Computer Science, The Queen's University of Belfast, Northern Ireland. This document may be used for non-profit purposes only; no part of this document may be duplicated unless this copyright notice is included.

References: I hope this definition is accurate (and would appreciate any corrections), but it cannot be taken as being definitive. For more information, see the ISO standard `ISO 7185:1990' or a good text book such as "Introduction to Pascal" by Jim Welsh and John Elder, Prentice Hall, International Series in Computer Science, ISBN 0-13-491549-6.

Last major revision: October 1995
Last update: August 1996


Main Categories of Syntactic Classes

Programs and Blocks, Procedure and Function Definitions, Statements, Expressions, Types, Record Fields, Input/Output, Variable and Identifier Categories, Low Level Definitions

Alphabetical List of Syntactic Classes

  • actual-function, actual-parameter, actual-parameter-list, actual-procedure, actual-value, actual-variable, addition-operator, array-type, array-variable, assignment-statement
  • base-type, block, bound-identifier, bound-specification
  • case-label-list, case-limb, case-statement, component-variable, compound-statement, conditional-statement, conformant-array-schema, constant, constant-definition, constant-definition-part, constant-identifier
  • declaration-part, digit, digit-sequence, directive
  • element-list, element-type, entire-variable, enumerated-type, expression, expression-list
  • factor, field-designator, field-identifier, field-list, field-width, file-buffer, file-component-type, file-type, file-variable, final-expression, fixed-part, for-statement, formal-parameter-list, formal-parameter-section, fraction-length, function-body, function-declaration, function-designator, function-heading, function-identification, function-identifier, function-parameter-section
  • goto-statement
  • identifier, identifier-list, if-statement, index-type, indexed-variable, initial-expression, integer-number
  • label, label-declaration-part, letter, lower-bound
  • multiplication-operator
  • number
  • ordinal-type-identifier, output-list, output-value
  • packed-conformant-array-schema, parameter-type, pointer-type, pointer-variable, procedure-and-function-declaration-part, procedure-body, procedure-declaration, procedure-heading, procedure-identification, procedure-identifier, procedure-parameter-section, procedure-statement, program, program-heading
  • real-number, record-section, record-type, record-variable, referenced-variable, relational-operator, repeat-statement, repetitive-statement, result-type
  • scale-factor, set, set-type, sign, simple-expression, simple-statement, simple-type, statement, statement-part, statement-sequence, string, string-character, structured-statement, structured-type, subrange-type
  • tag-field, term, type, type-definition, type-definition-part, type-identifier
  • unpacked-conformant-array-schema, unpacked-structured-type, unsigned-digit-sequence, upper-bound
  • value-parameter-section, variable, variable-declaration, variable-declaration-part, variable-identifier, variable-list, variable-parameter-section, variant, variant-part
  • while-statement, with-statement

Definitions of Syntactic Classes

Programs and Blocks

program
program-headingblock "."
program-heading
programidentifier "(" identifier-list ")" ";"
block
declaration-partstatement-part
declaration-part
[ label-declaration-part ]
[ constant-definition-part ]
[ type-definition-part ]
[ variable-declaration-part ]
procedure-and-function-declaration-part
label-declaration-part
labellabel { "," label } ";"
constant-definition-part
constconstant-definition ";" { constant-definition ";" }
constant-definition
identifier "=" constant
type-definition-part
typetype-definition ";" { type-definition ";" }
type-definition
identifier "=" type
variable-declaration-part
varvariable-declaration ";" { variable-declaration ";" }
variable-declaration
identifier-list ":" type
procedure-and-function-declaration-part
{ (procedure-declaration | function-declaration) ";" }
procedure-declaration
procedure-heading ";" procedure-body |
procedure-heading ";" directive |
procedure-identification ";" procedure-body
procedure-body
block
function-declaration
function-heading ";" function-body |
function-heading ";" directive |
function-identification ";" function-body
function-body
block
directive
forward | compiler-defined-directives
statement-part
beginstatement-sequenceend

Procedure and Function Definitions

procedure-heading
procedureidentifier [ formal-parameter-list ]
function-heading
functionidentifier [ formal-parameter-list ] ":" result-type
result-type
type-identifier
procedure-identification
procedureprocedure-identifier
function-identification
functionfunction-identifier
formal-parameter-list
"(" formal-parameter-section { ";" formal-parameter-section } ")"
formal-parameter-section
value-parameter-section |
variable-parameter-section |
procedure-parameter-section |
function-parameter-section
value-parameter-section
identifier-list ":" parameter-type
variable-parameter-section
varidentifier-list ":" parameter-type
procedure-parameter-section
procedure-heading
function-parameter-section
function-heading
parameter-type
type-identifier | conformant-array-schema
conformant-array-schema
packed-conformant-array-schema |
unpacked-conformant-array-schema
packed-conformant-array-schema
packedarray "[ " bound-specification " ]" of type-idnetifier
unpacked-conformant-array-schema
array "[ " bound-specification { ";" bound-specification } " ]"
of (type-identifier | conformant-array-schema)
bound-specification
identifier ".." identifier ":" ordinal-type-identifier
ordinal-type-identifier
type-identifier

Statements

statement-sequence
statement { ";" statement }
statement
[ label ":" ] (simple-statement | structured-statement)
simple-statement
[ assignment-statement | procedure-statement | goto-statement ]
assignment-statement
(variable | function-identifier) ":=" expression
procedure-statement
procedure-identifier [ actual-parameter-list ]
goto-statement
gotolabel
structured-statement
compound-statement | repetitive-statement | conditional-statement | with-statement
compound-statement
beginstatement-sequenceend
repetitive-statement
while-statement | repeat-statement | for-statement
while-statement
whileexpressiondostatement
repeat-statement
repeatstatement-sequenceuntilexpression
for-statement
forvariable-identifier ":=" initial-expression (to | downto) final-expressiondostatement
initial-expression
expression
final-expression
expression
conditional-statement
if-statement | case-statement
if-statement
ifexpressionthenstatement [ elsestatement ]
case-statement
caseexpressionof
case-limb { ";" case-limb } [ ";" ]
end
case-limb
case-label-list ":" statement
case-label-list
constant { "," constant }
with-statement
withrecord-variable { "," record-variable } dostatement
actual-parameter-list
"(" actual-parameter { "," actual-parameter } ")"
actual-parameter
actual-value | actual-variable | actual-procedure | actual-function
actual-value
expression
actual-procedure
procedure-identifier
actual-function
function-identifier

Expressions

expression
simple-expression [ relational-operatorsimple-expression ]
simple-expression
[ sign ] term { addition-operatorterm }
term
factor { multiplication-operatorfactor }
factor
variable | number | string | set | nil | constant-identifier | bound-identifier | function-designator | "(" expression ")" | notfactor
relational-operator
"=" | "<>" | "<" | "<=" | ">" | ">=" | "in"
addition-operator
"+" | "-" | or
multiplication-operator
"*" | "/" | div | mod | and
variable
entire-variable | component-variable | referenced-variable
entire-variable
variable-identifier | field-identifier
component-variable
indexed-variable | field-designator | file-buffer
indexed-variable
array-variable "[ " expression-list " ]"
field-designator
record-variable "." field-identifier
set
"[ " element-list " ]"
element-list
[ expression { "," expression } ]
function-designator
function-identifier [ actual-parameter-list ]
file-buffer
file-variable "^"

Types

type
simple-type | structured-type | pointer-type | type-identifier
simple-type
subrange-type | enumerated-type
enumerated-type
"(" identifier-list ")"
subrange-type
lower-bound ".." upper-bound
lower-bound
constant
upper-bound
constant
structured-type
[ packed ] unpacked-structured-type
unpacked-structured-type
array-type | record-type | set-type | file-type
array-type
array "[ " index-type { "," index-type } " ]" ofelement-type
index-type
simple-type
element-type
type
record-type
recordfield-listend
set-type
setofbase-type
base-type
type
file-type
fileoffile-component-type
file-component-type
type
pointer-type
"^" type-identifier

Record Fields

field-list
[ (fixed-part [ ";" variant-part ] | variant-part) [ ";" ] ]
fixed-part
record-section { ";" record-section }
record-section
identifier-list ":" type
variant-part
casetag-fieldtype-identifierofvariant { ";" variant }
tag-field
[ identifier ":" ]
variant
case-label-list ":" "(" field-list ")"

Input/Output

output-list
output-value { "," output-value }
output-value
expression [ ";" field-width [ ":" fraction-length ] ]
field-width
expression
fraction-length
expression

Variable and Identifier Categories

identifier
letter { letter | digit }
file-variable
variable
referenced-variable
pointer-variable "^"
record-variable
variable
pointer-variable
variable
actual-variable
variable
array-variable
variable
field-identifier
identifier
constant-identifier
identifier
variable-identifier
identifier
type-identifier
identifier
procedure-identifier
identifier
function-identifier
identifier
bound-identifier
identifier

Low Level Definitions

variable-list
variable { "," variable }
identifier-list
identifier { "," identifier }
expression-list
expression { "," expression }
number
integer-number | real-number
integer-number
digit-sequence
real-number
digit-sequence "." [ digit-sequence ] [ scale-factor ] |
digit-sequencescale-factor
scale-factor
("E" | "e") [ sign ] digit-sequence
unsigned-digit-sequence
digit { digit }
digit-sequence
[ sign ] unsigned-digit-sequence
sign
"+" | "-"
letter
"A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z" | "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" | "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" | "y" | "z"
digit
"0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
string
"'" string-character { string-character } "'"
string-character
any-character-except-quote | "''"
label
integer-number
constant
[ sign ] (constant-identifier | number) | string

Copyright: Dr S. Fitzpatrick (email S.Fitzpatrick@qub.ac.uk),
Department of Computer Science,
The Queen's University of Belfast.

Defining a language

A grammar defines a language.

In computer science, the most common type of grammar is the context-free grammar, and these grammars will be the primary focus of this article.

Context-free grammars have sufficient richness to describe the recursive syntactic structure of many (though certainly not all) languages.

I'll discuss grammars beyond context-free at the end.

Components of a context-free grammar

A set of rules is the core component of a grammar.

Each rule has two parts: (1) a name and (2) an expansion of the name.

For instance, if we were creating a grammar to handle english text, we might add a rule like:

noun-phrase may expand into articlenoun.

from which we could ultimately deduce that "the dog" is a noun-phrase.

Or, if we were describing a programming language, we could add a rule like:

expression may expand into expressionexpression

If we're working with grammars as mathematical objects, then instead of writing "may expand into," we'd simply write $\rightarrow$:

noun-phrase $\rightarrow$ articlenoun
expression $\rightarrow$ expressionexpression

As an example, consider the classic unambiguous expression grammar:

\[ \mathit{expr} \rightarrow \mathit{term}\; \mathtt{+}\; \mathit{expr} \\ \mathit{expr} \rightarrow \mathit{term} \\ \mathit{term} \rightarrow \mathit{term}\; \mathtt{*}\; \mathit{factor} \\ \mathit{term} \rightarrow \mathit{factor} \\ \mathit{factor} \rightarrow \mathtt{(}\;\mathit{expr}\;\mathtt{)} \\ \mathit{factor} \rightarrow \mathit{const} \\ \mathit{const} \rightarrow \mathit{integer} \]

So, how do we know that is a valid expression?

Because:

expr may expand into term;
which may expand into termfactor;
which may expand into factorfactor;
which may expand into constfactor;
which may expand into constconst;
which may expand into const;
which may expand into .

Backus-Naur Form (BNF) notation

When describing languages, Backus-Naur form (BNF) is a formal notation for encoding grammars intended for human consumption.

Many programming languages, protocols or formats have a BNF description in their specification.

Every rule in Backus-Naur form has the following structure:

$\mathit{name}$ $\mathit{expansion}$

The symbol means "may expand into" and "may be replaced with."

In some texts, a name is also called a non-terminal symbol.

Every name in Backus-Naur form is surrounded by angle brackets, , whether it appears on the left- or right-hand side of the rule.

An $\mathit{expansion}$ is an expression containing terminal symbols and non-terminal symbols, joined together by sequencing and choice.

A terminal symbol is a literal like ( or ) or a class of literals (like ).

Simply juxtaposing expressions indicates sequencing.

A vertical bar indicates choice.

For example, in BNF, the classic expression grammar is:

<expr> ::= <term> "+" <expr> | <term> <term> ::= <factor> "*" <term> | <factor> <factor> ::= "(" <expr> ")" | <const> <const> ::= integer

Naturally, we can define a grammar for rules in BNF:

$\mathit{rule}$ $\rightarrow$ $\mathit{name}$ $\mathit{expansion}$
$\mathit{name}$ $\rightarrow$ $\mathit{identifier}$
$\mathit{expansion}$ $\rightarrow$ $\mathit{expansion}$ $\mathit{expansion}$
$\mathit{expansion}$ $\rightarrow$ $\mathit{expansion}$ $\mathit{expansion}$
$\mathit{expansion}$ $\rightarrow$ $\mathit{name}$
$\mathit{expansion}$ $\rightarrow$ $\mathit{terminal}$

We might define identifiers as using the regular expression .

A terminal could be a quoted literal (like , or ) or the name of a class of literals (like ).

The name of a class of literals is usually defined by other means, such as a regular expression or even prose.

Extended BNF (EBNF) notation

Extended Backus-Naur form (EBNF) is a collection of extensions to Backus-Naur form.

Not all of these are strictly a superset, as some change the rule-definition relation to , while others remove the angled brackets from non-terminals.

More important than the minor syntactic differences between the forms of EBNF are the additional operations it allows in expansions.

Option

In EBNF, square brackets around an expansion, , indicates that this expansion is optional.

For example, the rule:

<term> ::= [ "-" ] <factor>

allows factors to be negated.

Repetition

In EBNF, curly braces indicate that the expression may be repeated zero or more times.

For example, the rule:

<args> ::= <arg> { "," <arg> }

defines a conventional comma-separated argument list.

Grouping

To indicate precedence, EBNF grammars may use parentheses, , to explictly define the order of expansion.

For example, the rule:

<expr> ::= <term> ("+" | "-") <expr>

defines an expression form that allows both addition and subtraction.

Concatenation

In some forms of EBNF, the operator explicitly denotes concatenation, rather than relying on juxtaposition.

Augmented BNF (ABNF) notation

Protocol specifications often use Augmented Backus-Naur Form (ABNF).

For example, RFC 5322 (email), uses ABNF.

RFC 5234 defines ABNF.

ABNF is similar to EBNF in principle, except that its notations for choice, option and repetition differs.

ABNF also provides the ability to specify specific byte values exactly -- detail which matters in protocols.

In ABNF:

  • choice is ; and
  • option uses square brackets: ; and
  • repetition is prefix; and
  • repetition or more times is prefix; and
  • repetition to times is prefix.

EBNF's becomes in ABNF.

Here's a definition of a date and time format taken from RFC 5322.

date-time = [ day-of-week "," ] date time [CFWS] day-of-week = ([FWS] day-name) / obs-day-of-week day-name = "Mon" / "Tue" / "Wed" / "Thu" / "Fri" / "Sat" / "Sun" date = day month year day = ([FWS] 1*2DIGIT FWS) / obs-day month = "Jan" / "Feb" / "Mar" / "Apr" / "May" / "Jun" / "Jul" / "Aug" / "Sep" / "Oct" / "Nov" / "Dec" year = (FWS 4*DIGIT FWS) / obs-year time = time-of-day zone time-of-day = hour ":" minute [ ":" second ] hour = 2DIGIT / obs-hour minute = 2DIGIT / obs-minute second = 2DIGIT / obs-second zone = (FWS ( "+" / "-" ) 4DIGIT) / obs-zone

Regular extensions to BNF

It's common to find regular-expression-like operations inside grammars.

For instance, the Python lexical specification uses them.

In these grammars:

  • postfix means "repeated 0 or more times"
  • postfix means "repeated 1 or more times"
  • postfix means "0 or 1 times"

The definition of floating point literals in Python is a good example of combining several notations:

floatnumber ::= pointfloat | exponentfloat pointfloat ::= [intpart] fraction | intpart "." exponentfloat ::= (intpart | pointfloat) exponent intpart ::= digit+ fraction ::= "." digit+ exponent ::= ("e" | "E") ["+" | "-"] digit+

It does not use angle brackets around names (like many EBNF notations and ABNF), yet does use (like BNF). It mixes regular operations like for non-empty repetition with EBNF conventions like for option.

The grammar for the entire Python language uses a slightly different (but still regular) notation.

Grammars in mathematics

Even when grammars are not an object of mathematical study themselves, in texts that deal with discrete mathematical structures, grammars appear to define new notations and new structures.

For more on this, see my article on translating math into code.

Beyond context-free grammars

Regular expressions sit just beneath context-free grammars in descriptive power: you could rewrite any regular expression into a grammar that represents the srings matched by the expression. But, the reverse is not true: not every grammar can be converted into an equivalent regular expression.

To go beyond the expressive power of context-free grammars, one needs to allow a degree of context-sensitivity in the grammar.

Context-sensitivity means that terminal symbols may also appear in the left-hand sides of rules.

Consider the following contrived grammar:

<top> ::= <a> ")" <a> ::= "(" <exp> "(" <exp> ")" ::= 7

may expand into ;
which may expand into ;
which may expand into .

While this change appears small, it makes grammars equivalent to Turing machines in terms of the languages they can describe.

By restricting the rules so that the the left-hand side has strictly fewer symbols than all expansions on the right, context-sensitive grammars are equivalent to (decidable) linear-bounded automata.

Even though some languages are context-sensitive, context-sensitive grammars are rarely used for describing computer languages.

For instance, C is slightly context-sensitive because of the way it handles identifiers and type, but this context-sensitivity is resolved by a special convention, rather than by introducing context-sensitivity into the grammar.

Parsing

This article covered the process of interpreting grammars and common notations.

A closely related topic is parsing.

Parsing takes a grammar and a string and answers two questions:

  1. Is that string in the language of the grammar?
  2. What is the structure of that string relative to the grammar?

For an comprehensive treatment of parsing techniques, I recommend Grune and Jacobs, Parsing Techniques: A Practical Guide:

As an aside, if you think you've invented a new parsing technique, you need to check this book first. Your peer reviewers will check it.

My own articles on parsing may also serve as a useful reference: