|
D.5 Using Clusters to Work with Spreadsheets
|
Sheerpower clusters allow you to work with spreadsheets (.csv files) as if they are
database tables. Clusters are very fast, processing millions of rows per
second. Any column can be treated as a lookup key. Lookup speeds of over 10
million per second are typical.
More with Spreadsheets
The cities dataset is 10,000 rows of the largest cities in the world. It
includes the city name, country code, and population for each one.
The
cities spreadsheet is input into a program using the
cluster input
statement. You give the name of the spreadsheet CSV file, the number of
headers, and the name of the cluster you will be using.
cluster cities: City$, Country$, Population, Region$, Latitude, Longitude
cluster input name '@..\safe\safe_world_cities.csv', headers 1: cities
print 'cities input: '; size(cities)
Note: The CLUSTER INPUT statement has many options and can input
JSON files, tab-separated-value (TSV) files, and even X12 files into clusters. For details see:
Cluster Input
Clusters and JSON Formatted Data
SheerPower also supports loading JSON files into clusters using the
cluster input statement. Below are the details and examples:
Details
The cluster input statement allows you to read JSON
files and load their data into a SheerPower cluster. The structure of
the JSON file should have a consistent number of name/value pairs
for each "row". The data is loaded sequentially into the cluster without
regard to the field names.
JSON Example
Consider a JSON file named @json_prefixed.json with the following content:
{
"people3":[
{
"id":"one",
"name": {
"first":"Fred",
"last":"Smith"
},
"price":234.56
},
{
"id":"two",
"name": {
"first":"Sally",
"last":"Sue"
},
"price":123.45
}
]
}
To load this JSON data into a SheerPower cluster, use the following code:
cluster people3: id$, name->first$, name->last$, price
cluster input name '@json_prefixed.json': people3
This code will read the JSON file and load the data into the people3 cluster. Each object in the JSON array is treated as a row in the cluster.
Cluster and EDI X12 file formats
This program demonstrates how to process an X12 file into a Sheerpower cluster:
version$ = "v1.0"
record_delimiter$ = "~"
field_delimiter$ = "*"
cluster x12_data: segment$
cluster input name 'x12_file.txt', record record_delimiter$, field field_delimiter$: x12_data
collect cluster x12_data
end collect
for each x12_data
print x12_data->segment$
next x12_data
stop
Sample X12 File (x12_file.txt):
ISA*00* *00* *ZZ*MYORG *ZZ*Interchange Rec*200831*2324*U*00501*100030537*0*P*>~
GS*SH*MYORG*Interchange Rec*20200831*2324*200030537*X*005010~
ST*856*300089443~
BSN*00*P1982123*20200831*2324*0001~
DTM*011*20200831*2324~
HL*1**S~
TD1*CTN*1****G*2*LB~
TD5**2*FDE*U********3D*3D*3D~
DTM*011*20200831~
N1*ST*SOO KIM*92*DROPSHIP CUSTOMER~
N3*26218 QUENTIN TURNPIKE~
N4*REANNAFORT*MS*51135~
HL*2*1*O~
PRF*1881225***20200831~
HL*3*2*P~
MAN*CP*037178019302492~
HL*4*3*I~
LIN*1*VC*11216.32*SK*RGR-11216.32~
SN1*1*1*EA~
PID*F****ALL TERRAIN ENTRY GUARD KIT 2020 JEEP WRANGLER J~
CTT*4~
SE*20*300089443~
GE*1*200030537~
IEA*1*100030537~
CSV Spreadsheet Example
In the example code below, you can input data into the
a$ variable and
then click
run to try different cities.