Understanding Clusters

Programs often need to combine multiple variables into a single named object. Doing so makes it easier to keep track of your variables and adds clarity to your code. In Sheerpower, we call this object a CLUSTER. Some programming languages call a similar construct a struct or a record.

From a programming perspective, Sheerpower's clusters are highly advantageous, especially for business applications. They allow for flexible and efficient data management by enabling any field within a cluster to be used as a key. This feature simplifies complex data handling, such as dynamically accessing and manipulating data from CSV files or databases. Clusters offer a blend of simplicity and power, streamlining data operations that might otherwise require more elaborate code in other languages.

Each cluster is given a name and a list of variables associated with that name. These variables are sometimes also called cluster fields or fields.

Simple clusters have no rows. They are one-dimensional and are typically used to store related information about a single overall concept. For example, below is a cluster that stores the types of food that one has at meal time:

cluster meals: protein$, liquid$, carb$
meals->protein$ = "Eggs"
meals->liquid$ = "Tea"
meals->carb$ = "Toast"

Scalar Cluster:
A scalar cluster in Sheerpower is a single, structured data object that groups together related variables under one name. It's similar to a record or struct in other programming languages. Each variable within the cluster, known as a cluster member or field, holds a single value. A scalar cluster is one-dimensional and represents a single instance of data, such as information about a person or a product. Optionally, each field can be explicitly declared with a data type, such as int, real, or string.

Example:

cluster employee: name$, age, position$, real salary
employee->name$ = "John Doe"
employee->age = 30
employee->position$ = "Manager"
employee->salary = 70_000

In this example, employee is a scalar cluster that holds information about a single employee. It includes fields for name$, age, position$, and salary. The salary field has been explicitly declared with the data type real.
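As a small follow-on sketch, the fields can be read back with the same -> syntax used to assign them. The print statement with semicolon-separated items is assumed to behave as it does elsewhere in Sheerpower; the values are the ones from the example above.

```sheerpower
// Sketch: read fields back from the scalar cluster defined above
cluster employee: name$, age, position$, real salary
employee->name$ = "John Doe"
employee->age = 30
employee->position$ = "Manager"
employee->salary = 70_000

// Each field is referenced as clustername->fieldname
print employee->name$; " is a "; employee->position$
print "Age: "; employee->age
print "Salary: "; employee->salary
```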

Cluster Array:
A cluster array, on the other hand, is a collection of multiple scalar clusters, arranged like rows in a spreadsheet. Each row in the cluster array represents an individual record, and each record contains multiple fields. This structure allows you to manage large sets of related data, similar to how you would use a database table or a spreadsheet.

Note: Cluster arrays can be extremely large. If your computer has sufficient RAM and pagefile space, clusters can support up to one billion rows, with key lookup rates in the millions per second.

Example:

cluster student: name$, age, grade$
add cluster student
  student->name$ = "Alice"
  student->age = 20
  student->grade$ = "A"
add cluster student
  student->name$ = "Bob"
  student->age = 21
  student->grade$ = "B"

In this example, student is a cluster array, where each add cluster statement adds a new row to the array, representing a different student. You can then access, manipulate, and iterate through these rows.

Key Differences:

Single vs. Multiple Records: A scalar cluster holds a single record of related fields, whereas a cluster array holds multiple records, each with the same set of fields.
Data Organization: Scalar clusters are used when you need to manage a single entity's data, while cluster arrays are ideal for handling multiple entities with similar data structures.
Iteration: Cluster arrays allow you to iterate over multiple rows, making them suitable for tasks that involve working with lists of data, like databases or spreadsheets.
Use Case: Use a scalar cluster for simpler, individual data objects and a cluster array for more complex datasets that require storing and managing multiple records.

Notice that the syntax for referencing cluster variables is:

clustername->variablename

If you have multiple clusters all based on the same "root" cluster, such as MEALS, you can define the root cluster once and then reference it when defining related clusters. In this example, we first define a MEALS cluster and then use it to further define breakfast, lunch, and dinner clusters.

cluster meals: protein$, liquid$, carb$

cluster breakfast using meals
breakfast->protein$ = "Eggs"
breakfast->liquid$ = "Tea"
breakfast->carb$ = "Toast"

cluster lunch using meals
lunch->protein$ = "Chicken"
lunch->liquid$ = "Coffee"
lunch->carb$ = "Rice"

cluster dinner using meals
dinner->protein$ = "Steak"
dinner->liquid$ = "Wine"
dinner->carb$ = "Rice"

If a routine needs a lot of data passed into it, a cluster is a convenient way to pass that data as a single named object.

The print cluster CLUSTERNAME, list statement shows the name of each cluster member and its value.
By default, print cluster CLUSTERNAME outputs the cluster fields in CSV format.
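The two forms can be compared side by side in a short sketch. It reuses the meals cluster from earlier. The printed output is not reproduced here, since its exact layout depends on the Sheerpower runtime.

```sheerpower
cluster meals: protein$, liquid$, carb$
meals->protein$ = "Eggs"
meals->liquid$ = "Tea"
meals->carb$ = "Toast"

// List form: field names together with their values
print cluster meals, list

// Default form: the field values in CSV format
print cluster meals
```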

When declaring a cluster, you can specify initial values for fields. These values will be set when the cluster is created, providing a convenient way to initialize the cluster with default data.

cluster diet: protein$ = "Steak", liquid$ = "Wine", carb$ = "Rice", calories = 2000
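A minimal sketch of how initial values might be used in practice follows. It assumes that an initialized field can later be reassigned like any other field, which the source does not state explicitly.

```sheerpower
cluster diet: protein$ = "Steak", liquid$ = "Wine", carb$ = "Rice", calories = 2000

// Fields already hold their initial values at this point
print cluster diet, list

// Assumption: an initial value is only a starting point and can be overwritten
diet->calories = 1800
print cluster diet, list
```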

Sheerpower handles NULL or undefined values in clusters by treating them as empty strings or zero values, depending on the data type. When performing operations like sorting or printing, these NULL values are handled gracefully, but developers should be aware of their presence and account for them in logic where necessary.
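One way to account for such values is sketched below. The item cluster and its fields are hypothetical, and the sketch assumes that a field which was never assigned is what the paragraph above means by an undefined value.

```sheerpower
// Hypothetical cluster whose fields are never assigned
cluster item: name$, quantity

// Guard against an empty name before using it
if item->name$ = "" then
  print "No name has been set"
end if

// Inspect the current value of every field
print cluster item, list
```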

Summary: Understanding Clusters in Sheerpower

Clusters in Sheerpower are versatile data structures designed to simplify data management and enhance code clarity, particularly for business applications. They allow developers to group related variables under one name, making data handling efficient and intuitive.

Key Features of Clusters:

Scalar Clusters: Single-instance data objects that store related fields, ideal for managing data about individual entities like employees or products.

Cluster Arrays: Collections of scalar clusters organized like rows in a table, perfect for handling very large datasets such as transactional records or large inventory lists.

Flexibility: Any field in a cluster can act as a key, enabling dynamic data access and manipulation.

Initialization: Clusters can be initialized with default values for easy setup and use.

CSV Integration: Clusters can output fields in CSV format using the print cluster statement, streamlining integration with other data systems.

Benefits of Using Clusters:

Reduces complexity in managing related variables.
Improves code readability and maintainability.
Enables seamless data handling for both single entities and large datasets.
Handles NULL or undefined values gracefully.

Whether managing simple data objects or complex datasets, Sheerpower clusters offer a powerful combination of simplicity and flexibility. Their ability to adapt to various programming needs makes them a key tool for efficient and elegant code design.
