Download free books at BookBooN.com
An Introduction to Relational Database Theory
127
Building on The Foundation
Points to note:
x Some aggregations can be thought of in terms of repeated invocation of some dyadic operator,
which I shall call the basis operator. In the case of SUM, for example, the basis operator is
addition. Because addition is commutative and associative, we could define an n-adic form of the
operator, just as we did in Chapter 4 for operators such as JOIN and UNION. If we call this
operator ADD, then we would have, for example, ADD(1,4,1,5) = ((1+4)+1)+5. But
those operands, 1, 4, 1, and 5, can be given in any order (thanks, in this case, to the commutativity
and associativity of +), and that lack of any significance to the ordering is what allows us to
define aggregate operators for relations. The lack of an ordering to the tuples of a relation
militates against defining aggregate operators whose results vary according to the order in which
the operands are presented. Consider string concatenation, for example. We can concatenate any
number of strings together to form a single string, but the result depends on the order in which the
input strings are presented.
x The basis operators for MAX and MIN might reasonably be called HIGHER and LOWER,
respectively, where HIGHER(x,y) returns x unless y>x, in which case it returns y, and
LOWER(x,y) returns x unless y<x, in which case it returns y. You can confirm for yourself that
HIGHER and LOWER are commutative and associative.
x If the relation operand is empty, then the result of aggregation can be defined only if the basis
operator has an identity value.
xii
In the case of SUM, the basis operator is addition, whose identity
value is zero. In the cases of MAX and MIN, the type of the result is the type of the attribute given
as the second operand. The identity value of the basis operator depends on that type. If the type
has a defined least value, min, such that min>v is FALSE for all values v of that type, then min is
the identity under HIGHER. If a least value is not defined, then there is no identity value under
HIGHER, and MAX of the empty relation is undefined for attributes of that type. Similarly,
MIN(r,a) is defined only when a greatest value is defined for the type of attribute a.
x The examples shown use a simple attribute name as the second operand, and Version 1 of
Tutorial D in fact requires that operand to be a simple attribute name. In general, however, the
second operand in invocations of SUM, MAX, and MIN should be allowed to be any expression of
an appropriate type (obviously a numeric type in the case of SUM). Version 2 of Tutorial D does
indeed allow this.
x The simple attribute names used in my examples are cases of open expressions, as defined in
Chapter 4, Section 4.7. As in other places where open expressions are permitted, closed
expressions are also permittedallowing us to sagely observe, for example, that SUM(r,1) is
equivalent to COUNT(r).
Several other aggregate operators are defined in Tutorial D. Here are some that we can now deal
with summarily:
AVG ( r, x ) is equivalent to SUM ( r, x ) / COUNT ( r ) and is therefore undefined in the case where
r is empty. As an exercise, the reader might like to consider whether there can be a basis operator
for AVG.