esProc is a handy tool for preparing test data.
Suppose we need to prepare test data containing employee information in text format, including the employee number, name, gender, date of birth, and city and state of residence. This example shows how such test data can be prepared.
The test data must meet the following requirements: employee numbers are generated sequentially; names and genders are generated randomly; birthdays are generated randomly, but each employee's current age must be between 18 and 55; and cities and states are picked randomly from a database table.
Three text files, Top100MaleNames.txt, Top100FemaleNames.txt and Top100Surnames.txt, store the 100 most common male given names, female given names, and surnames respectively.
The employees' cities are retrieved randomly from the CITIES table in the database:
Using the STATEID field in the CITIES table, we can look up the employee's state abbreviation in the STATES table:
Note that when generating employee information, an employee's name is related to his or her gender. Therefore we first read the text data, combine the most common male and female names, and add a gender field to them:
Once this is done, C2 contains the following table of names and genders:
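Outside esProc, the same preparation step can be sketched in Python. This is only an illustration, assuming each text file holds one name per line; the function names, the NAME and GENDER keys, and the M/F gender codes are hypothetical and not taken from the original grid.

```python
from pathlib import Path


def load_names(path):
    """Read one name per line from a text file, skipping blank lines.

    Assumption: the file is UTF-8 and holds exactly one name per line.
    """
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def build_name_table(male_names, female_names):
    """Tag every given name with a gender and merge both lists into one table."""
    table = [{"NAME": name, "GENDER": "M"} for name in male_names]
    table += [{"NAME": name, "GENDER": "F"} for name in female_names]
    return table
```

With the files from this example, the table could be built as `build_name_table(load_names("Top100MaleNames.txt"), load_names("Top100FemaleNames.txt"))`. Keeping the gender next to each name means a single random pick later yields both values together.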
Similarly, the city and the state abbreviation are related. After the data is retrieved from the database, the state abbreviation is added to the city information:
And A4:
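The lookup that attaches a state abbreviation to each city can be sketched in Python as follows. The rows are plain dictionaries standing in for database records; only STATEID comes from the text above, while the NAME, ABBR and STATE keys and the sample rows are assumptions made for illustration.

```python
def attach_state_abbr(cities, states):
    """Return a copy of each city row with its state's abbreviation added.

    Assumes every city row has a STATEID that also appears in `states`.
    """
    # Build the lookup once so each city is resolved in constant time.
    abbr_by_id = {state["STATEID"]: state["ABBR"] for state in states}
    return [{**city, "STATE": abbr_by_id[city["STATEID"]]} for city in cities]


# Made-up sample rows; the real CITIES and STATES columns may differ.
sample_states = [
    {"STATEID": 1, "ABBR": "TX"},
    {"STATEID": 2, "ABBR": "CA"},
]
sample_cities = [
    {"NAME": "Austin", "STATEID": 1},
    {"NAME": "San Diego", "STATEID": 2},
]
cities_with_state = attach_state_abbr(sample_cities, sample_states)
```

Doing this join once, before the generation loop, means each random city pick already carries its state and no further lookup is needed per employee.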
Next, we set up the basic information for the generated data, including the data structure of the employee information table and the number of test records to generate:
Here, the number in C5 defines the cache size: after every 1500 records are generated, the data is written to the text file once. This way we can control the amount of memory being used. B6 writes the data structure of the employee information table to the text file.
As the next step, we can now run a loop to generate the testing data for every employee:
B7 generates a random sequence number used to look up the name, while C7 generates one for the surname. They are used to retrieve the name and gender of the employee. According to the requirements, B11 randomly generates the age, and line 12 of the code selects a random date in the corresponding year as the employee's birthday. Lines 13 and 14 of the code randomly select a city and retrieve the city and state for the employee.
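As a rough Python counterpart to this per-employee step, the sketch below draws an age first and then a birthday consistent with that age. It picks the date from the exact range of birthdays that give the drawn age today, which is an assumption about how to honor the 18-to-55 rule and may differ from what the grid does. All function names and record keys are hypothetical.

```python
import random
from datetime import date, timedelta


def years_before(d, years):
    """Return the date `years` years before d (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        return d.replace(year=d.year - years, day=28)


def full_years(birthday, today):
    """Age in completed years on `today`."""
    had_birthday = (today.month, today.day) >= (birthday.month, birthday.day)
    return today.year - birthday.year - (0 if had_birthday else 1)


def random_birthday(today, rng, min_age=18, max_age=55):
    """Draw an age, then a birthday that yields exactly that age on `today`."""
    age = rng.randint(min_age, max_age)
    latest = years_before(today, age)  # turns `age` today
    earliest = years_before(today, age + 1) + timedelta(days=1)
    offset = rng.randint(0, (latest - earliest).days)
    return earliest + timedelta(days=offset)


def random_employee(emp_id, names, surnames, cities, today, rng):
    """Assemble one record from random picks (keys are illustrative only)."""
    given = rng.choice(names)   # one pick carries both name and gender
    city = rng.choice(cities)   # one pick carries both city and state
    return {
        "EID": emp_id,
        "NAME": f"{given['NAME']} {rng.choice(surnames)}",
        "GENDER": given["GENDER"],
        "BIRTHDAY": random_birthday(today, rng),
        "CITY": city["NAME"],
        "STATE": city["STATE"],
    }
```

Passing `today` and a `random.Random` instance explicitly keeps the generator reproducible: the same seed and reference date always produce the same records.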
After all the required fields have been generated, B15 appends the record to the employee information table created in A5. A16 controls the output, writing the data to the text file after every 1500 records. After each write, A5 is cleared to avoid occupying too much memory.
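The buffer-and-flush pattern described here can be sketched in Python as follows. The batch size of 1500 comes from the text; the tab-separated layout, the function name and the returned flush count are assumptions made for the example.

```python
import csv
import io


def write_in_batches(records, out, fieldnames, batch_size=1500):
    """Write records to `out`, flushing the in-memory buffer every batch_size rows.

    Returns the number of flushes performed, which is handy for checking.
    """
    writer = csv.DictWriter(out, fieldnames=fieldnames, delimiter="\t")
    writer.writeheader()  # the table structure goes out first
    buffer = []
    flushes = 0
    for record in records:
        buffer.append(record)
        if len(buffer) >= batch_size:
            writer.writerows(buffer)
            buffer.clear()  # release the rows that were just written
            flushes += 1
    if buffer:  # write whatever is left after the loop ends
        writer.writerows(buffer)
        flushes += 1
    return flushes


# Usage with an in-memory stream; a real run would pass an open text file.
demo_out = io.StringIO()
demo_rows = ({"EID": i} for i in range(1, 3201))
demo_flushes = write_in_batches(demo_rows, demo_out, ["EID"])
```

Because the records arrive from a generator and the buffer is cleared after each write, at most `batch_size` rows are held in memory at any time.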
After all the data has been output, the text file looks like this:
When preparing test data with esProc, we can run a loop to generate a large amount of random data. Within the loop, we can easily retrieve existing database or text data, so the data can be generated according to business needs without writing complex programs.