Equal grouping
Grouping basis of equal grouping is certain fields (or computed
columns derived from fields) within a data set. Each group is a subset of
original data set.
Case
description: Group sales orders by the year.
Data
description: Data of sales orders are shown below:
Or, combine grouping and summarizing into one step with groups function:
Alignment grouping
Criterions used
for equal grouping come from within a dataset. But sometimes, they
originate from without,
like fields of other data sets, arrays created by users, parameter list and so
on. Thus the alignment grouping comes into being.
Different from equal grouping, method of alignment grouping may produce empty subsets, which have no members to correspond to data of a group. It may also produce incomplete groups, meaning that some data won't be included in any group. These things won't happen for equal grouping.
Case description: Table of top 10 sellers
has been worked out in the light of performance, please group sales orders
according to the table order.
Enumeration grouping
The grouping basis for enumeration grouping could be more flexible.
It could be any Boolean expressions. Those records consistent with the
expression will get into the same group.
Similar to alignment grouping, enumeration grouping is also of
incomplete grouping, probably producing empty subsets or results that some
records are not included in any group. In addition, with this grouping method,
it is likely that some records may appear in more than one group.
Case description: Divide sales orders into
four groups, they are: A. order amount is less than 1000; B. order amount is
less than 2000; C. order amount is less than 3000; D. order amount is less than
10,000. Note that the data cannot be grouped repeatedly, that is, if an order
has been in group A, it must not be put into group B, C, or D.
The above data set (table sequence) can be accessed from a database
or a file. For example:
A1=file("E:/sales.txt").import@t()
esProc code:
A2=A1.group(year(OrderDate))
Computed results:
Code interpretation:
1. In this example, grouping basis
comes from OrderDate. The date of sales order will be converted into the year
through year(OrderDate), and data of the same year will be grouped together.
2.There may be multiple fields
for grouping. For example, regroup data of different years and sellers
according to year and sellers. The code is:
A1.group(year(OrderDate),SellerId)
3.Often, the grouped data are
used to perform aggregation operations, such as, compute each year’s sales
amount according to A2. The code is:
A2.new(year(OrderDate):y,~.sum(Amount):a)
Computed
results are:Or, combine grouping and summarizing into one step with groups function:
A1.groups(year(OrderDate):y; sum(Amount):a)
Of course, sometimes we have to execute grouping and
summarizing separately in order to reuse the code and improve computational
efficiency. For example, filter one of the groups of A2, and perform
association computation for another group. Another situation is that, if, after
summarizing, data of a certain group are unusual and worth further study, then
this group can be used directly to go on with the computations. It’s no need to
filter the group again.
4. By default, esProc's group
function will group data by using hash algorithm.
But, comparing adjacent rows can have higher performance for ordered data. This
can be executed by using option @o in group function. For
example:
A1.group@o(year(OrderDate),SellerId)
Alignment grouping
Criterions used
for equal grouping come from within a dataset. But sometimes, they
originate from without,
like fields of other data sets, arrays created by users, parameter list and so
on. Thus the alignment grouping comes into being.Different from equal grouping, method of alignment grouping may produce empty subsets, which have no members to correspond to data of a group. It may also produce incomplete groups, meaning that some data won't be included in any group. These things won't happen for equal grouping.
The sales orders are the same as those
in above example. Data are stored in A1.
Table
of the top 10 sellers is stored in B1 as follows:
Sellers
table may come from a temporary table, or be generated by a piece of code. The
generating process is not the focus of this example.
Code interpretation:
1. In this example, the
grouping basis (sellers table) comes from without the data set to be grouped.
After grouping is completed, a group contains only the data of one seller, and
groups are sorted according to sellers table.
2. Because sellers in sales orders outnumber those
in sellers table, some of the orders won’t appear in any groups. If we want to
create one more group to store these orders, we can use function option @n
as follows:
A1.align@a@n(B1:empID,SellerId)
The one more group will be put last as follows:
3. Sometimes, the grouping basis is not within the data set
to be grouped, such as, “newly-employed sellers table”. In this case, it’s
normal to produce empty groups. Modify the first group of data in the table
into empID=100, for example, the computed results will be:
Enumeration grouping
The grouping basis for enumeration grouping could be more flexible.
It could be any Boolean expressions. Those records consistent with the
expression will get into the same group.
The data set before grouping:
The
sales orders are the same as those in above example. Data are stored in A1.
esProc
code:
A2=["?<=1000","?<=2000","?<=3000","?<=10000"]
A3=A1.enum(A2,Amount)
Computed results:
Case interpretation:
1.In this example, grouping basis
for grouping is multiple flexible expressions. Each record will be compared
with the expressions. Those consistent with the same expression will be put
into the same group. Groups are sorted according to order of grouping basis as
well.
2.By default, enumeration
grouping will not produce identical results. The method, which is showed in the
above example, is that after group A’s data are selected, the rest of data will
be compared with expression B to see their consistency. While the use of
function option @r, which represents that all data are compared with
expression B, may produce identical results. For example: A3=A1.enum@r(A2,Amount), computed results are:
3. Similar to alignment grouping,
if the expression for enumeration grouping is inconsistent with any data to be
grouped, empty group will appear. Besides, if some data are inconsistent with
any expression, function option @n can be used to put them into a surplus
group.
没有评论:
发表评论