# Advanced DTL usage 

DTL is easy to begin using, and simple transforms are easy to understand. This
simplicity is deceptive, as DTL is extremely powerful. Once you are familiar
with some of its more advanced features, you can build far more sophisticated
transforms.

If you have not already, be sure to read the [README](../README.md),
[DTL Expression Guide](./DTL-Expressions.md) and [Helper guide](./DTL-Helpers.md).

## A word about undefined and null

Often in data you will encounter undefined or `null` values. DTL itself has no
separate concept of `null`, only `undefined` and therefore considers `null` and
`undefined` in input data to be equivalent. 

It should be noted that while DTL understands undefined values and handles them
properly, it can appear as though it doesn't because undefined properties can
disappear when the result is output as JSON. This is because JSON has no
representation for `undefined`, so JavaScript's `JSON.stringify()` omits any
object property whose value is `undefined` (a property that is actually `null`
is written out as `null`). You can override this by passing a replacer function
to `JSON.stringify()`. [This page](https://muffinman.io/blog/json-stringify-removes-undefined/)
describes how to do this.
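
As a minimal sketch of that approach, the replacer below converts `undefined`
property values to `null` so the keys survive serialization. The `keepUndefined`
name and the sample `result` object are illustrative and are not part of DTL.

```javascript
// Replacer for JSON.stringify: emit null wherever a value is undefined,
// so the property is kept in the output instead of being dropped.
function keepUndefined(key, value) {
  return value === undefined ? null : value;
}

// Hypothetical transform result with a missing nickname.
const result = { name: 'John', nickname: undefined };

const withoutReplacer = JSON.stringify(result);
const withReplacer = JSON.stringify(result, keepUndefined);

console.log(withoutReplacer); // {"name":"John"}
console.log(withReplacer);    // {"name":"John","nickname":null}
```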

## The concept of empty

Often in programming, the need arises to determine whether a particular
variable holds a value or not. In JavaScript, you encounter the types
`undefined` and `null` to represent the absence of value. These types, while
useful, come with their quirks that could lead to unexpected behaviors or bugs
if not handled carefully. For instance, you may have stumbled upon scenarios
where the distinction between `undefined`, `null`, and other falsy values like
an empty string `''` could lead to different outcomes in your code, sometimes
making the code more verbose with additional checks.

Here's a quick refresher on how JavaScript treats `undefined` and `null`:

```javascript
let val = undefined;
console.log(typeof val); // 'undefined'

val = null;
console.log(typeof val); // 'object' - a bit unexpected!

Object.keys(val);  // throws an exception :(
```

Both `undefined` and `null` signify the absence of a value in JavaScript, but
as demonstrated, they behave differently. Moreover, when dealing with JSON,
only `null` is recognized, leaving `undefined` out in the cold, which can
complicate data handling.

To streamline the handling of such scenarios, DTL introduces the concept of
`empty`. In DTL, an item is considered `empty` if it holds no meaningful value.
This includes values that are `undefined` (or `null`, which DTL treats the same way), empty strings `''`, arrays with
no elements, and objects with no properties. This unified approach towards
handling the absence of value simplifies data checks, making your transforms
more straightforward and less error-prone compared to handling `undefined`,
`null`, and other falsy values separately as in JavaScript.

In the following sections, we'll explore how DTL's `empty` concept is leveraged
through various helpers and how it contrasts with JavaScript's approach, aiming
to provide a clearer, more efficient way of handling the absence of value in
your data transformations.

It is possible in DTL to determine the type of a value, by using the `typeof()`
helper, but in almost every situation, using the concept of `empty` is
preferable.

You can determine if a value is empty using the `empty()` helper.  Once again,
an example is helpful:

```
empty(undefined) // true
empty('') // true
empty(' ') // false - contains a space
empty([]) // true
empty({}) // true
```

Why is this useful? Because in almost every situation, what you really want to
know about the data isn't whether it has a value of `undefined` or similar,
it's whether it has a meaningful value. `empty()` tells you whether it has a
meaningful value or not in all situations.

For example, if an object is `empty()` there may be no need to process it.
Likewise for an array. If you need some data, and the place you are looking
for it is `empty()`, you will need to look elsewhere. So making use of
`empty()` can make your transforms much simpler than the equivalent
hand-written checks.
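
To give a sense of the checks that `empty()` replaces, here is a rough
plain-JavaScript approximation of the rules described above. It is a sketch
only, not DTL's implementation: the `isEmpty` name is hypothetical, and the
treatment of numbers and booleans as non-empty is an assumption.

```javascript
// Plain-JavaScript approximation of the "empty" rules described above.
// This is NOT DTL's implementation; isEmpty is a hypothetical helper.
function isEmpty(value) {
  if (value === undefined || value === null) return true; // DTL treats null as undefined
  if (typeof value === 'string') return value.length === 0;
  if (Array.isArray(value)) return value.length === 0;
  if (typeof value === 'object') return Object.keys(value).length === 0;
  return false; // assumption: numbers, booleans, etc. count as holding a value
}

console.log(isEmpty(undefined)); // true
console.log(isEmpty(''));        // true
console.log(isEmpty(' '));       // false
console.log(isEmpty([]));        // true
console.log(isEmpty({}));        // true
```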

Related to the `empty()` helper is the *First Not Empty*, or `fne()` helper.
As its name implies, when you give `fne()` multiple values, it will return the
first one that is not empty. This can be especially useful for getting a value
from one of multiple places, or falling back to a default. For example:

```
fne($user.nickname $user.first_name 'User')
```

will try to obtain a value from `$user.nickname`. If that has a non-empty value
it will be returned. If, however, it is empty, it will look in
`$user.first_name` and if that is also empty, will return the string 'User'.
Note that because of the concept of `empty()`, you don't need to concern
yourself with checking whether `$user.nickname` is `undefined` or `null` or an
empty string. In most cases, for the purposes of transforming data, they are
all equivalently `empty()`.

As you can see, `fne()` makes it easy to do multiple fallback lookups and set
default values.
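
For comparison, a hand-written JavaScript fallback chain with the same intent
might look like the sketch below. Both `isEmpty` and `firstNotEmpty` are
hypothetical stand-ins, not DTL code, and returning `undefined` when every
candidate is empty is this sketch's own choice.

```javascript
// Hypothetical plain-JavaScript stand-ins; not DTL's implementation.
function isEmpty(value) {
  if (value === undefined || value === null || value === '') return true;
  if (Array.isArray(value)) return value.length === 0;
  if (typeof value === 'object') return Object.keys(value).length === 0;
  return false;
}

// Return the first argument that is not empty (undefined if all are empty).
function firstNotEmpty(...candidates) {
  return candidates.find((candidate) => !isEmpty(candidate));
}

const user = { nickname: '', first_name: 'Jane' };
console.log(firstNotEmpty(user.nickname, user.first_name, 'User')); // Jane
console.log(firstNotEmpty(undefined, null, 'User'));                // User
```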

As mentioned, if you really need to know, you can use `typeof()` to determine
whether something is actually a `string` or `undefined`. You can also use `==`
to check its actual value, for example:

```
$v == '' // true if $v is an empty string, false if $v is undefined
```

But in most cases, we have found that we just want to know whether a value is
`empty()` or not.

### Comments 

When creating transforms, it can be helpful to add comments. DTL understands
two types of comments in an expression. If you are familiar with C, JavaScript,
or other languages with a C-derived syntax, these will be familiar to you.

```
/* Find what we call them */ fne($user.nickname $user.first_name 'User')

empty($user.nickname) // Does user have a nickname defined?
```

As you would expect, `/* ... */` can be placed anywhere in an expression, and
`// ...` marks everything until the end of the line as a comment.


### A word about whitespace 

DTL expressions are fairly straightforward and do not require much extra
punctuation. Within an array context `[ ... ]`, for example, DTL treats each
item separated by whitespace as an additional element, so commas are
unnecessary. Likewise, no special termination character is required to signal
the end of an expression; an expression simply ends where its text ends.

DTL treats spaces, tabs and newlines as whitespace and all are equivalent. This
means that you can create complex transforms using multiple lines and indenting
for clarity.

This can be hard to see when transforms are encoded using JSON because JSON
does not support multi-line strings directly, but JSON is only one way of
encoding transforms. If you encode your transforms some other way, such as YAML
or [JSON5](https://json5.org), or extract them directly from a database,
newlines and indentation can be quite helpful, and DTL will happily let you
indent and newline your expressions however you see fit.
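
As one illustration, assuming the transform is defined in JavaScript source
rather than loaded from JSON, a template literal lets an expression span
several indented lines. The `transform_library` variable name is illustrative;
the expression itself is the `fne()` example from earlier in this guide.

```javascript
// A transform library written as a JavaScript object literal. The template
// literal (backticks) lets the expression span several indented lines.
const transform_library = {
  out: `(: fne(
          $user.nickname
          $user.first_name
          'User'
        ) :)`,
};

// The stored transform is an ordinary string that contains newlines.
console.log(transform_library.out.includes('\n')); // true
```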

## Understanding DTL Transform Libraries

DTL (Data Transformation Language) facilitates the definition and execution of
transformations on data. A central feature in DTL is transform libraries, which
are collections of named transforms grouped together.

### Defining Transform Libraries

A transform library is an object where each key represents a transform name,
and the value is the transform definition. This structure allows for organizing
multiple related transforms together. Transforms within a library can reference
each other, and can be accessed both internally and externally.

Here's a simplified example of a transform library:

```json
{
  "out": {
    "original_value": "(: $num :)",
    "rounded": {
        "num": "(: $num -> 'round_number' :)"
    }
  },
  "round_number": "(: math.round($.) :)"
}
```

### Applying Transforms

Transforms are triggered using the `DTL.apply` function, which accepts three
parameters:

- `input_data`: The data subject to transformation.
- `transform_library`: The library comprising the transforms.
- `transform_name_to_run`: The specific transform to execute. If omitted, the 'out' transform is used.

```javascript
DTL.apply(input_data, transform_library, transform_name_to_run);
```

### Recursive Processing

If a transform is an object or array, DTL processes it recursively, traversing
its structure and executing any enclosed expressions. This feature is
invaluable for constructing specific data structures based on input data.
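
The idea of recursive processing can be pictured with the toy model below. It
is a sketch under stated assumptions and not DTL's implementation: the
`processTransform` function, the `evaluate` callback, and the `lookup` stub
(which only understands `$.property`) are all hypothetical.

```javascript
// Toy model of recursive transform processing; NOT DTL's implementation.
// `evaluate` is a hypothetical callback standing in for an expression engine.
function processTransform(transform, input, evaluate) {
  if (Array.isArray(transform)) {
    return transform.map((item) => processTransform(item, input, evaluate));
  }
  if (transform !== null && typeof transform === 'object') {
    const result = {};
    for (const [key, value] of Object.entries(transform)) {
      result[key] = processTransform(value, input, evaluate); // keys are copied unchanged
    }
    return result;
  }
  const match = typeof transform === 'string' && transform.match(/^\(:\s*(.*?)\s*:\)$/s);
  return match ? evaluate(match[1], input) : transform; // other values pass through
}

// Stub evaluator that only understands "$.property" lookups.
const lookup = (expression, input) => input[expression.replace(/^\$\./, '')];

const transform = { person: { name: '(: $.name :)' }, tags: ['(: $.city :)', 'static'] };
const input = { name: 'John Doe', city: 'Springfield' };

console.log(JSON.stringify(processTransform(transform, input, lookup)));
// {"person":{"name":"John Doe"},"tags":["Springfield","static"]}
```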

### Example: Creating a Nested Object

Given a flat data object with person details:

```json
{
  "name": "John Doe",
  "street": "123 Main St",
  "city": "Springfield",
  "state": "IL",
  "zip": "62704"
}
```

A transform library can be crafted to organize this data into a nested format:

```json
{
  "out": "(: $ -> 'create_nested_object' :)",
  "create_nested_object": {
    "person": {
      "name": "(: $.name :)"
    },
    "address": {
      "street": "(: $.street :)",
      "city": "(: $.city :)",
      "state": "(: $.state :)",
      "zip": "(: $.zip :)"
    }
  }
}
```

In this setup, `create_nested_object` is a transform that builds a nested
object from the flat input data. The `out` transform references
`create_nested_object`.

### Advanced Example: Multiple Transforms and Specified Transform Execution

Consider an input object with numerical data:

```json
{
  "num": 105,
  "num2": 42
}
```

We can design a transform library to perform various operations on these
numbers:

```json
{
  "out": {
    "original_value": "(: $num :)",
    "rounded": {
        "num": "(: $num -> 'round_number' :)",
        "num2": "(: $num2 -> 'round_number' :)"
    },
    "num_is_big": "(: $num -> 'is_big' :)"
  },
  "round_number": "(: math.round($.) :)",
  "is_big": "(: $. >= 100 :)"
}
```

Here, besides the `out` transform, two additional transforms `round_number` and
`is_big` are defined. They round a number and check if a number is big (>=100),
respectively.

If we like, instead of defaulting to the 'out' transform, we can specify a transform
name in `DTL.apply`:

```javascript
DTL.apply(input_data.num, transform_library, 'is_big');
```

This will only evaluate the `is_big` transform, checking if `num` is greater
than or equal to 100.

Through this mechanism, DTL offers a flexible, organized approach to define and
apply complex data transformations, rendering it a potent tool for data
manipulation and structure definition.


### Example: Processing an Array of People using `map`

Consider an input array containing flat objects with person information:

```json
[
  {
    "name": "John Doe",
    "street": "123 Main St",
    "city": "Springfield",
    "state": "IL",
    "zip": "62704"
  },
  {
    "name": "Jane Smith",
    "street": "456 Elm St",
    "city": "Rivertown",
    "state": "TX",
    "zip": "75001"
  }
]
```

We can design a transform library to organize this data into nested structures for each person:

```json
{
  "out": "(: $. -> 'process_people' :)",
  "process_people": "(: map($. 'create_nested_object') :)",
  "create_nested_object": {
    "person": "(: $item -> 'get_person' :)",
    "address": "(: $item -> 'get_address' :)"
  },
  "get_person": {
    "first_name": "(: (split($.name ' '))[0] :)",
    "last_name": "(: (split($.name ' '))[1] :)"
  },
  "get_address": {
    "street": "(: $.street :)",
    "city": "(: $.city :)",
    "state": "(: $.state :)",
    "zip": "(: $.zip :)"
  }
}
```

In this setup:

- The `out` transform calls `process_people`.
- `process_people` employs the `map` helper to apply `create_nested_object` to each item in the input array.
- `create_nested_object` constructs a nested object from each flat person object, using the `get_person` transform for the person and the `get_address` transform to create the address structure.

The `map` helper is instrumental as it iterates over each item in the input array, applying the specified transform, and gathering the results into a new array. This way, each flat person object is transformed into a nested structure, and a new array of these nested structures is produced. 

The usage would be:

```javascript
DTL.apply(input_data, transform_library);
```

This will output an array of nested person objects:

```json
[
  {
    "person": {
      "first_name": "John",
      "last_name": "Doe"
    },
    "address": {
      "street": "123 Main St",
      "city": "Springfield",
      "state": "IL",
      "zip": "62704"
    }
  },
  {
    "person": {
      "first_name": "Jane",
      "last_name": "Smith"
    },
    "address": {
      "street": "456 Elm St",
      "city": "Rivertown",
      "state": "TX",
      "zip": "75001"
    }
  }
]
```
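
For comparison, the same reshaping written by hand in plain JavaScript might
look like the sketch below. It does not use DTL; the `people` and `nested`
variable names are illustrative, and the data is the input array from this
example.

```javascript
// Hand-written JavaScript equivalent of the transform library above,
// shown only for comparison; it does not use DTL.
const people = [
  { name: 'John Doe', street: '123 Main St', city: 'Springfield', state: 'IL', zip: '62704' },
  { name: 'Jane Smith', street: '456 Elm St', city: 'Rivertown', state: 'TX', zip: '75001' },
];

const nested = people.map((item) => {
  // Mirrors get_person: split the name on a space into first and last name.
  const [first_name, last_name] = item.name.split(' ');
  return {
    person: { first_name, last_name },
    // Mirrors get_address: copy the address fields into their own object.
    address: { street: item.street, city: item.city, state: item.state, zip: item.zip },
  };
});

console.log(nested[0].person); // { first_name: 'John', last_name: 'Doe' }
```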

### Direct Transform Access in DTL

Besides utilizing the default `out` transform, DTL allows direct access to
other named transforms within a library from your JavaScript code. This feature
is handy for executing specific transformations without going through the
entire library.

For instance, given a single person object:

```javascript
const one_person = {
  "name": "John Doe",
  "street": "123 Main St",
  "city": "Springfield",
  "state": "IL",
  "zip": "62704"
};
```

You can directly access and execute the `get_person` transform using the
`DTL.apply` function:

```javascript
const result = DTL.apply(one_person, transform_library, 'get_person');
```

This call will bypass the `out` transform, and instead, directly execute
`get_person` on the provided person object, producing an output with the first
and last name separated:

```json
{
  "first_name": "John",
  "last_name": "Doe"
}
```

This mechanism offers flexibility to invoke any specific transform defined in
the library, catering to various data processing needs directly from your
JavaScript environment.


## Literal Values in Transform Processing

DTL permits the inclusion of literal values directly within a transform, which
are then reflected in the result. This feature is particularly useful in
scenarios where static values or structures are required in the output
alongside dynamically transformed values. Literal values can be part of
transforms that are objects or arrays.

Let's enhance a previous example to illustrate this concept:

```json
{
  "out": "(: $. -> 'process_people' :)",
  "process_people": "(: map($. 'create_nested_object') :)",
  "create_nested_object": {
    "person": "(: $item -> 'get_person' :)",
    "address": "(: $item -> 'get_address' :)",
    "metadata": {
      "processing_date": "2023-10-25",
      "source": "user_input"
    }
  },
  "get_person": {
    "first_name": "(: (split($.name ' '))[0] :)",
    "last_name": "(: (split($.name ' '))[1] :)"
  },
  "get_address": {
    "street": "(: $.street :)",
    "city": "(: $.city :)",
    "state": "(: $.state :)",
    "zip": "(: $.zip :)"
  }
}
```

In the `create_nested_object` transform, a `metadata` object with literal
values is introduced. These values, `processing_date` and `source`, are static
and will appear as-is in the transformation result. 

Here's a snippet of the output illustrating the inclusion of literal values:

```json
{
  "person": {
    "first_name": "John",
    "last_name": "Doe"
  },
  "address": {
    // ... address fields ...
  },
  "metadata": {
    "processing_date": "2023-10-25",
    "source": "user_input"
  }
}
```

It's important to note that keys of objects within transforms are never
processed as expressions; they are treated as static identifiers. This allows
for a clear demarcation between static structure and dynamic value
transformation in DTL, facilitating precise control over the output format.

## Static Keys vs Dynamic Keys in DTL

In DTL, keys within transforms are kept static and are not processed as
expressions. This design choice ensures the stability and predictability of
your object structure during transformation, preventing inadvertent
modifications which could occur if keys were dynamically interpreted.

However, there are cases where you might need to dynamically generate parts of
your object structure based on data. For these scenarios, DTL offers the
`object creator` syntax `{ }`. This syntax allows you to explicitly create
objects with dynamic keys, giving you the flexibility to construct complex or
data-driven structures while maintaining a clear intention in your code.

With the object creator syntax, you construct key-value pairs in a controlled
manner, specifying exactly how keys are generated and values are assigned,
ensuring that dynamic object creation is both deliberate and transparent. 

This balanced approach allows for a robust yet flexible data transformation
process, catering to a wide range of use cases while preserving the integrity
of your object structures.

### Object Creator Syntax

The object creator syntax allows you to generate an object from one or more
pairs of values. Each pair is a two-element array, where the first item becomes
the key, and the second item becomes the value. Here's the syntax:

```
{ [ $key $value ] }
```

#### Example:

Given a variable `$first_name` with a value of "John", this expression:

```
{ [ $first_name length($first_name) ] }
```

Produces an object:

```json
{ "John": 4 }
```

#### Unpacking Objects into Pairs

Conversely, if you have an object and wish to break it down into an array of
key/value pairs, you can use the `pairs` helper. The `pairs` helper takes an
object as an argument and returns an array of key/value pairs.

##### Syntax:

```
pairs($object)
```

##### Example:

Given an object:

```json
{
  "John": 4,
  "Doe": 3
}
```

Applying the `pairs` helper:

```
pairs($.)
```

Produces an array of key/value pairs:

```json
[
  ["John", 4],
  ["Doe", 3]
]
```

This complementary functionality allows for flexible manipulation of objects
and arrays within DTL, enabling dynamic key generation and object decomposition
to suit your data transformation needs.
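
If you are coming from JavaScript, the closest standard-library analogues are
`Object.fromEntries()` and `Object.entries()`. The sketch below is for
comparison only and does not call DTL.

```javascript
// JavaScript analogues, for comparison only (these are not DTL calls).
const first_name = 'John';

// Like the object creator: build an object from [key, value] pairs.
const created = Object.fromEntries([[first_name, first_name.length]]);
console.log(created); // { John: 4 }

// Like pairs(): break an object into [key, value] pairs.
const unpacked = Object.entries({ John: 4, Doe: 3 });
console.log(unpacked); // [ [ 'John', 4 ], [ 'Doe', 3 ] ]
```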

## Flattening and Unflattening Objects in DTL

Handling deeply nested objects efficiently is a common need in data
transformation, especially when you want to interact with or overwrite specific
fields deep within an object without disturbing the rest of the structure. DTL
provides two powerful helpers for this purpose: `flatten($obj)` and
`unflatten($obj)`.

### Flattening Objects

The `flatten()` helper is used to condense a deeply nested object into a
single-level object while preserving the original structure within the key
names. This is achieved by encoding the full key path using dot-notation (or a
custom separator if provided). Here's the syntax:

```plaintext
flatten( $array_or_object [ $separator ] [ $prefix ] )
```

#### Example Usage:
Suppose you have a complex nested object, and you want to update specific deep
fields within it. You can use `flatten()` to simplify the object, apply your
transformations, and then `unflatten()` to restore the original nested
structure.

```json
{
   "out": "(: unflatten( &(flatten($.) ($. -> 'get_new_deep_fields'))) :)",
   "get_new_deep_fields": {
        "metadata.detail.origin_ip": "(: $original_ip :)",
        "metadata.detail.geohash": "(: $geohash :)"
    }
}
```

In this example:
1. `flatten($.)` simplifies the input object into a single-level object.
2. `get_new_deep_fields` transform creates or updates the specified deep fields.
3. `&()` merges the flattened original object with the new fields.
4. `unflatten()` restores the nested structure, integrating the new or updated fields.

### Unflattening Objects

The `unflatten()` helper reverses the flattening process, reconstructing the
original nested structure from a single-level object.

#### Example Result:
Given the original object and the transform above, if the `original_ip` and
`geohash` are "192.168.1.1" and "u4pruydqqvj", the output will be:

```json
{
   "metadata": {
        "detail": {
            "origin_ip": "192.168.1.1",
            "geohash": "u4pruydqqvj"
        }
        // ... other existing fields ...
    }
    // ... other existing fields ...
}
```

This methodology facilitates precise manipulations on complex nested objects
with ease, allowing for targeted updates or additions to the data structure
while preserving the original format.
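
The flatten, merge, unflatten pattern can also be sketched in plain JavaScript
to show what happens to the keys at each step. This is not DTL's
implementation: `flattenObject` and `unflattenObject` are hypothetical helpers
that handle only plain nested objects (no arrays, no prefix argument), and the
`original` object is made-up sample data.

```javascript
// Plain-JavaScript sketch of dot-notation flatten/unflatten; NOT DTL's code.
function flattenObject(obj, separator = '.', path = '') {
  const flat = {};
  for (const [key, value] of Object.entries(obj)) {
    const fullKey = path ? path + separator + key : key;
    const isNested = value !== null && typeof value === 'object' &&
      !Array.isArray(value) && Object.keys(value).length > 0;
    if (isNested) {
      Object.assign(flat, flattenObject(value, separator, fullKey));
    } else {
      flat[fullKey] = value; // leaf value: store under the full key path
    }
  }
  return flat;
}

function unflattenObject(flat, separator = '.') {
  const nested = {};
  for (const [fullKey, value] of Object.entries(flat)) {
    const keys = fullKey.split(separator);
    const last = keys.pop();
    let target = nested;
    for (const key of keys) {
      if (!(key in target)) target[key] = {}; // create intermediate objects as needed
      target = target[key];
    }
    target[last] = value;
  }
  return nested;
}

// Hypothetical input and the deep fields to add or overwrite.
const original = { id: 7, metadata: { detail: { origin_ip: '10.0.0.1' }, source: 'api' } };
const updates = { 'metadata.detail.origin_ip': '192.168.1.1', 'metadata.detail.geohash': 'u4pruydqqvj' };

// Flatten, merge the new fields over the old ones, then restore the nesting.
const merged = unflattenObject({ ...flattenObject(original), ...updates });
console.log(JSON.stringify(merged));
// {"id":7,"metadata":{"detail":{"origin_ip":"192.168.1.1","geohash":"u4pruydqqvj"},"source":"api"}}
```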