Data processing with multi-database, off database, log, txt and excel: A Code Example for Reading and Writing EXCEL Files with esProc

It is convenient to read and write Excel files with esProc. We'll illustrate this through an example:

Example description: Read the order data and the seller information list from data.xlsx, compute the sales amount of each department from the two sheets, and write the result to result.xlsx.

Original Excel file: data.xlsx contains two sheets. The order data is shown in the following figure:

The seller information list is as follows:

esProc code:

A1-A2: Read the first sheet (order data) and the second sheet (seller information list) from the Excel file data.xlsx, and store them in cells A1 and A2 as table sequences.

A3-A4: First, left-join the order data with the seller information list on the employee ID, then compute the sales amount of each department by grouping and summarizing. The algorithm is not discussed in detail here because it is not the focus of this example.
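
The left join and grouped sum in A3-A4 can be illustrated outside esProc. The following is a minimal Python sketch of the same idea, not the esProc code itself. SellerId and Amount are column names from the order data; the seller-sheet column names EId and Dept and all sample rows are assumptions made for illustration.

```python
from collections import defaultdict

# Sample rows standing in for the two sheets (illustrative data, not from data.xlsx).
orders = [
    {"OrderID": 1, "SellerId": 10, "Amount": 100.0},
    {"OrderID": 2, "SellerId": 11, "Amount": 250.0},
    {"OrderID": 3, "SellerId": 10, "Amount": 50.0},
    {"OrderID": 4, "SellerId": 99, "Amount": 75.0},  # no matching seller
]
sellers = [
    {"EId": 10, "Dept": "Sales"},      # EId and Dept are hypothetical column names
    {"EId": 11, "Dept": "Marketing"},
]


def sales_by_department(orders, sellers):
    """Left-join orders to sellers on the employee ID, then sum Amount per department."""
    dept_by_id = {s["EId"]: s["Dept"] for s in sellers}
    totals = defaultdict(float)
    for order in orders:
        # Left join: an order without a matching seller is kept, with department None.
        dept = dept_by_id.get(order["SellerId"])
        totals[dept] += order["Amount"]
    return dict(totals)


result = sales_by_department(orders, sellers)
```

Because the join is a left join, the order whose seller is missing from the seller list still contributes to the totals, grouped under a department of None.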

A5: Write the result to result.xlsx.
result.xlsx

Code interpretation

Column name

importxls is called with the @t option, which means the first row of the sheet is used as the column names of the esProc table sequence. For example, click cell A1 to see its value:

Therefore, the algorithm that follows can access data in A1 using column names directly.

Sheet name

The above code reads data by sheet number, but sometimes it is preferable to refer to a sheet by name. esProc supports this as well. For instance, if “sales” is the name of the sheet holding the order data, reading by name only requires replacing the sheet number in A1 with the sheet name:
file("E:/data.xlsx").importxls@t(;"sales")
The same applies when writing a result to a file. For example, to export the result in A4 to a sheet named summary, the code is:
file("E:/result.xlsx").exportxls@t(A4;"summary")

Scope of data

By default, the above code reads data from the first row of the sheet to the last. In practice, however, often only part of the data needs to be read, as shown in the following figure:

The importxls function can specify the range of rows to read. For example, reading from the 4th row onward can be written as:
file("E:/data.xlsx").importxls@t(;1,4)

Reading from the 4th row to the 1000th row can be written as:
file("E:/data.xlsx").importxls@t(;1,4:1000)
The columns to read can also be specified. For example, to read the three columns OrderID, SellerId and Amount, the code could be:
file("E:/data.xlsx").importxls@t(OrderID,SellerId,Amount;1)

Or read by column numbers:
file("E:/data.xlsx").importxls@t(#1,#3,#4;1)
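
To make the row range and column selection concrete, here is a minimal Python sketch of the same idea on an in-memory grid. It is not esProc code: the helper read_range, its parameters, the Client column and the sample rows are all hypothetical, and the sketch assumes that the first row of the chosen range holds the column names, in the spirit of the @t option.

```python
def read_range(grid, start=1, end=None, columns=None):
    """Return rows start..end (1-based, inclusive) of grid as a list of dicts.

    Assumes the first row of the range holds the column names.
    columns may contain names or 1-based column numbers; None keeps all columns.
    """
    block = grid[start - 1:end]  # end=None reads to the last row
    if not block:
        return []
    header, data = block[0], block[1:]
    if columns is None:
        picked = list(range(len(header)))
    else:
        picked = [c - 1 if isinstance(c, int) else header.index(c) for c in columns]
    return [{header[i]: row[i] for i in picked} for row in data]


# Hypothetical sheet: three lines of notes, then the header in row 4.
sheet = [
    ["Order report", None, None, None],
    ["Generated for illustration", None, None, None],
    [None, None, None, None],
    ["OrderID", "Client", "SellerId", "Amount"],
    [1, "A", 10, 100.0],
    [2, "B", 11, 250.0],
]

by_name = read_range(sheet, start=4, columns=["OrderID", "SellerId", "Amount"])
by_number = read_range(sheet, start=4, columns=[1, 3, 4])
first_only = read_range(sheet, start=4, end=5)
```

Selecting by name and selecting by position give the same rows here, which mirrors the two equivalent ways of choosing columns described above.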

Extension:
With the for loop statement, esProc can read and write Excel files in batches.
With parameters and macros, esProc makes computations based on Excel data source more flexible.

Because it supports computation over multiple data sources, esProc can perform hybrid computation across databases, text files and Excel files, as well as data migration.

Sunday, July 27, 2014