Tidy Function API

Tidy functions are the main function used in a tidy(...) flow. These are the primary functions to be used when wrangling data. They expect their input to either be a flat list of items.

tidy#

The main function that starts a tidy flow. Used to chain multiple tidy functions together and to smartly handle working with grouped data. The items it works with must be a flat list.

Parameters#

items#

object[]

The collection of data to work with, a flat array of items.

...tidyFns#

| (items: object[]) => object[]
| groupBy() => any /* with export options specified */

Any number of functions can be supplied to tidy which will be called as if in a pipeline: the output of function 1 is the input to function 2.

The typical case is (items: object[]) => object[], but groupBy may output something different if you specify an export option. If you do export from groupBy, it must be the last function called in the tidy flow.

Usage#

const data = [
{ str: 'foo', value: 3 },
{ str: 'foo', value: 3 },
{ str: 'foo', value: 1 },
{ str: 'bar', value: 3 },
{ str: 'bar', value: 1 },
{ str: 'bar', value: 1 },
{ str: 'foo', value: 3 },
{ str: 'bar', value: 7 },
];
tidy(
data,
distinct(['str', 'value']),
filter((d) => d.value <= 3),
summarize({ summedValue: sum('value') })
);
// output:
[{ summedValue: 8 }]

addItems / addRows#

Adds items to the end of a collection.

Parameters#

itemsToAdd#

| object
| object[]
| (items: object[]) => (object | object[])

The items to add to the collection or a function that resolves to items to add given the input set.

Usage#

const data = [{ a: 1 }, { a: 2 }];
tidy(data, addRows({ a: 3 }));
// output:
[{ a: 1 }, { a: 2 }, { a: 3 }]
tidy(data, addRows([{ a: 4 }]));
// output:
[{ a: 1 }, { a: 2 }, { a: 4 }]
tidy(data, addRows((items) => [{ a: items.length * 10 }]));
// output:
[{ a: 1 }, { a: 2 }, { a: 10 }]

arrange / sort#

Sorts items by the specified keys and comparators.

Parameters#

comparators#

| string /* key of item */
| ((a: object, b: object) => number)
| Array<string | ((a: object, b: object) => number)>

A key or set of keys of the item to sort by, or comparator functions that return -1, 0, or 1 if a < b, a == b, a > b respectively. You can mix and match keys and comparator functions when supplying an array.

For convenience, you can flip to descending order for keys by wrapping the key string with the desc(key: string) function. There is a corresponding asc(key: string) function for ascending data, but this is the default, so is unnecessary to use.

You can also sort the values for a key in a pre-specified order with the fixedOrder function:

fixedOrder(
key: string | ((d) => any),
order: string[],
options: { position: 'start' | 'end' })

Usage#

const data = [
{ str: 'foo', value: 3 },
{ str: 'foo', value: 4 },
{ str: 'bar', value: 2 },
{ str: 'bar', value: 1 },
{ str: 'bar', value: 5 },
];
tidy(data, arrange(['str', desc('value')])
// output:
[
{ str: 'bar', value: 5 },
{ str: 'bar', value: 2 },
{ str: 'bar', value: 1 },
{ str: 'foo', value: 4 },
{ str: 'foo', value: 3 },
]
tidy(data, arrange((a, b) => a.value - b.value));
// output:
[
{ str: 'bar', value: 1 },
{ str: 'bar', value: 2 },
{ str: 'foo', value: 3 },
{ str: 'foo', value: 4 },
{ str: 'bar', value: 5 },
]
tidy(data, arrange([fixedOrder('value', [5,4,1,3,2])]));
// output:
[
{ str: 'bar', value: 5 }, // <-- pinned to start
{ str: 'foo', value: 4 },
{ str: 'bar', value: 1 },
{ str: 'foo', value: 3 }, // <-- unsorted items
{ str: 'bar', value: 2 },
]
tidy(data,
arrange([
fixedOrder('value', [5,4,1], { position: 'end' })
]));
// output:
[
{ str: 'foo', value: 3 }, // <-- unsorted items
{ str: 'bar', value: 2 },
{ str: 'bar', value: 5 }, // <-- pinned to end
{ str: 'foo', value: 4 },
{ str: 'bar', value: 1 },
]

complete#

Complete a collection with missing combinations of data, can be useful for zero filling data. This is a convenience function that combines expand, leftJoin, and replaceNully.

Parameters#

expandKeys#

| string /* key of item */
| string[]
| { [key]: string | any[] | (items: object[]) => any[] }

The keys to expand the collection to have all combinations of. This can be specified as a single key string, an array of key strings or a key mapping object. The key mapping object maps from keys in the items to either:

  • { a: 'a' }: the key name itself. In this case, the values to use for the combinations will be derived from what is in the data currently.
  • { a: [1, 2, 3, 4] } an array of values denoting all possible values for this key, even if they do not occur in the data.
  • { a: fullSeq('a') } a function mapping from the items in the collection to an array of all possible values. This is typically used in combination with sequence helper functions like fullSeq.

replaceNullySpec#

{ [key]: any }

A map from key name to the value that nully values should be replaced with for that key. For example, given an objects of the shape {a: number, b: string}, the replaceNullySpec may look like { a: -1, b: 'n/a' }. Note you are not required to fill in values for all keysโ€“ any unspecified keys will keep their nully value.

Usage#

const data = [
{ a: 1, b: 'b1', c: 100 },
{ a: 2, b: 'b1', c: 200 },
{ a: 3, b: 'b1', c: 300 },
{ a: 1, b: 'b2', c: 101 },
{ a: 2, b: 'b2', c: 201 },
];
tidy(data, complete(['a', 'b'], { c: 0 }))
// output:
[
{ a: 1, b: 'b1', c: 100 },
{ a: 1, b: 'b2', c: 101 },
{ a: 2, b: 'b1', c: 200 },
{ a: 2, b: 'b2', c: 201 },
{ a: 3, b: 'b1', c: 300 },
{ a: 3, b: 'b2', c: 0 },
]

count#

Tallies the number distinct values for the specified keys and adds the count as a new key (default n). Optionally sorts by the count. This is a convenience wrapper around groupBy, tally, and optionally arrange.

Parameters#

groupKeys#

| string /* key of item */
| (item: object) => any
| Array<string | (item: object) => any>

The group keys to pass to groupBy. Either a key in the item or an accessor function that returns the grouping value, or an array combining these two options.

options#

{
name: string = 'n',
sort: boolean = false,
}
  • name = 'n': The name of the count value in the resulting items.
  • sort = false: Whether or not the resulting items should be sorted by the count key descending.

Usage#

const data = [
{ a: 1, b: 10, c: 100 },
{ a: 1, b: 10, c: 101 },
{ a: 2, b: 20, c: 200 },
{ a: 3, b: 30, c: 300 },
];
tidy(data, count('a'));
// output:
[{ a: 1, n: 2 }, { a: 2, n: 1 }, { a: 3, n: 1 }];
tidy(data, count('a', { name: 'count' }))
// output:
[
{ a: 1, count: 2 },
{ a: 2, count: 1 },
{ a: 3, count: 1 },
]

debug#

Logs to the console the current state of the data. For grouped data, each group will be output.

The data passes through unmodified.

Parameters#

label?#

| string

A label to display along with the debugged output.

options?#

{
limit?: number | null,
output?: 'log' | 'table'
}
  • limit = 10: When non-null, the output is limited to the first n items.
  • output = 'table': Switches between console.log or console.table as the output mechanism.

Usage#

const data = [
{ a: 1, b: 10, c: 100 },
{ a: 2, b: 20, c: 200 },
];
tidy(data, debug())
/*
[tidy.debug] ------------------------------------------------------------------
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”
โ”‚ (index) โ”‚ a โ”‚ b โ”‚ c โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ 0 โ”‚ 1 โ”‚ 10 โ”‚ 100 โ”‚
โ”‚ 1 โ”‚ 2 โ”‚ 20 โ”‚ 200 โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”˜
*/
tidy(data, debug('test label', { limit: 1 }))
/*
[tidy.debug] test label -------------------------------------------------------
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”
โ”‚ (index) โ”‚ a โ”‚ b โ”‚ c โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ 0 โ”‚ 1 โ”‚ 10 โ”‚ 100 โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”˜
*/

distinct#

Removes items with duplicate values for the specified keys. If no keys provided, uses strict equality for comparison. You may also think of this as reducing a dataset to just unique values for the specified columns.

Parameters#

keys#

| string /* key of item */
| (item: object) => any
| Array<string | (item: object) => any>

The set of keys or accessors to use to compare whether two items in a collection are equal.

Usage#

const data = [
{ str: 'foo', value: 1 },
{ str: 'foo', value: 3 },
{ str: 'far', value: 3 },
{ str: 'bar', value: 1 },
{ str: 'foo', value: 3 },
];
tidy(data, distinct(['str', 'value']))
// output:
[
{ str: 'foo', value: 1 },
{ str: 'foo', value: 3 },
{ str: 'far', value: 3 },
{ str: 'bar', value: 1 },
]
tidy(data, distinct([(d) => d.str[0], 'value']))
// output:
[
{ str: 'foo', value: 1 },
{ str: 'foo', value: 3 },
{ str: 'bar', value: 1 },
]

expand#

Expands a collection of items to include all combinations of the specified keys. Non-specified keys will be dropped.

Parameters#

expandKeys#

| string /* key of item */
| string[]
| { [key]: string | any[] | (items: object[]) => any[] }

The keys to expand the collection to have all combinations of. This can be specified as a single key string, an array of key strings or a key mapping object. The key mapping object maps from keys in the items to either:

  • { a: 'a' }: the key name itself. In this case, the values to use for the combinations will be derived from what is in the data currently.
  • { a: [1, 2, 3, 4] } an array of values denoting all possible values for this key, even if they do not occur in the data.
  • { a: fullSeq('a') } a function mapping from the items in the collection to an array of all possible values. This is typically used in combination with sequence helper functions like fullSeq.

Usage#

const data = [
{ a: 1, b: 'b1', c: 100 },
{ a: 2, b: 'b1', c: 200 },
{ a: 4, b: 'b1', c: 300 },
{ a: 1, b: 'b2', c: 101 },
{ a: 2, b: 'b2', c: 201 },
];
tidy(data, expand('a'));
// output:
[{ a: 1 }, { a: 2 }, { a: 4 }]
tidy(data, expand(['a', 'b']))
// output:
[
{ a: 1, b: 'b1' },
{ a: 1, b: 'b2' },
{ a: 2, b: 'b1' },
{ a: 2, b: 'b2' },
{ a: 4, b: 'b1' },
{ a: 4, b: 'b2' },
]
tidy(data, expand({ a: [1, 2, 3, 4, 5], b: 'b' }))
// output:
[
{ a: 1, b: 'b1' },
{ a: 1, b: 'b2' },
{ a: 2, b: 'b1' },
{ a: 2, b: 'b2' },
{ a: 3, b: 'b1' },
{ a: 3, b: 'b2' },
{ a: 4, b: 'b1' },
{ a: 4, b: 'b2' },
{ a: 5, b: 'b1' },
{ a: 5, b: 'b2' },
]
tidy(data, expand({ a: fullSeq('a') }));
// output:
[{ a: 1 }, { a: 2 }, { a: 3 }, { a: 4 }]

fill#

Fills values for the specified keys to match the last seen value in the collection.

Parameters#

keys#

| string /* key of item */
| string[]

The key or keys in the items to fill in. Only the specified keys will be affected.

Usage#

const data = [
{ a: 1, b: null, c: undefined, d: 1 },
{ a: null, b: 2, c: undefined },
{ a: null, c: 3, d: 3 },
{ a: 4, b: 4, c: 4, d: 4 },
{},
{ c: 6 },
{ c: 7, d: 7 },
];
tidy(data, fill(['a', 'b', 'c', 'd'])));
// output:
[
{ a: 1, b: null, c: undefined, d: 1 },
{ a: 1, b: 2, c: undefined, d: 1 },
{ a: 1, b: 2, c: 3, d: 3 },
{ a: 4, b: 4, c: 4, d: 4 },
{ a: 4, b: 4, c: 4, d: 4 },
{ a: 4, b: 4, c: 6, d: 4 },
{ a: 4, b: 4, c: 7, d: 7 }
]

filter#

Filters out items from the collection based on the filter fn, similar to Array.prototype.filter.

Parameters#

filterFn#

(item: object, index: number, array: object[]) => boolean

The predicate function to filter by: items are only kept if it returns true.

Usage#

const data = [{ value: 1 }, { value: 2 }, { value: 3 }];
tidy(data, filter((d) => d.value % 2 === 1))
// output:
[{ value: 1 }, { value: 3 }]

fullJoin#

Performs a full join on two collections of items.

Parameters#

itemsToJoin#

object[] /* the join dataset */

The collection of items to join.

options#

{
by?:
| string /* key in both datasets */
| string[]
| {
[string /* key in join */]: string /* key in original */
}
}

An options object specifying with the following options:

  • by The key (string) or keys (string[]) to join the two collections on. This form only works if both sets of data have the same column names. If you need to map more specifically, provide an object mapping from key in the original data set to key in the join dataset. Note that if by is not provided, then overlapping columns will be autodetected and used.

Usage#

tidy([{ a: 1, b: 2 }, { a: 2, b: 5 }],
fullJoin(
[{ a: 1, c: 3 }, { a: 4, c: 4 }],
{ by: 'a' }
)
)
// output:
[
{ a: 1, b: 2, c: 3 },
{ a: 2, b: 5 },
{ a: 4, c: 4 },
];
const data = [
{ a: 1, J: 'j', b: 10, c: 100 },
{ a: 1, J: 'k', b: 60, c: 600 },
{ a: 1, J: 'J', b: 30, c: 300 },
{ a: 2, J: 'j', b: 20, c: 200 },
{ a: 3, J: 'x', b: 50, c: 500 },
];
const joinData = [
{ a: 1, J: 'j', altJ: 'j', x: 'x1', y: 'y1' },
{ a: 1, J: 'J', altJ: 'J', x: 'x9', y: 'y9' },
{ a: 2, J: 'j', altJ: 'j', x: 'x2', y: 'y2' },
{ a: 2, J: 'X', altJ: 'x', x: 'x5', y: 'y5' },
];
tidy(data, fullJoin(joinData, { by: ['a', 'J'] }));
// output:
[
{ a: 1, J: 'j', altJ: 'j', b: 10, c: 100, x: 'x1', y: 'y1' },
{ a: 1, J: 'k', b: 60, c: 600 },
{ a: 1, J: 'J', altJ: 'J', b: 30, c: 300, x: 'x9', y: 'y9' },
{ a: 2, J: 'j', altJ: 'j', b: 20, c: 200, x: 'x2', y: 'y2' },
{ a: 3, J: 'x', b: 50, c: 500 },
{ a: 2, J: 'X', altJ: 'x', x: 'x5', y: 'y5' },
]
tidy(data, fullJoin(joinData, { by: { a: 'a', altJ: 'J' } }))
// output:
[
{ a: 1, J: 'j', altJ: 'j', b: 10, c: 100, x: 'x1', y: 'y1' },
{ a: 1, J: 'k', b: 60, c: 600 },
{ a: 1, J: 'J', altJ: 'J', b: 30, c: 300, x: 'x9', y: 'y9' },
{ a: 2, J: 'j', altJ: 'j', b: 20, c: 200, x: 'x2', y: 'y2' },
{ a: 3, J: 'x', b: 50, c: 500 },
{ a: 2, J: 'X', altJ: 'x', x: 'x5', y: 'y5' },
]

groupBy#

Restructures the data to be nested by the specified group keys then runs a tidy flow on each of the leaf sets. Grouped data can be exported into different shapes via group export helpers, or if not specified, will be ungrouped back to a flat list of items.

See the groupBy docs for details.

Usage#

const data = [
{ str: 'a', ing: 'x', foo: 'G', value: 1 },
{ str: 'b', ing: 'x', foo: 'H', value: 100 },
{ str: 'b', ing: 'x', foo: 'K', value: 200 },
{ str: 'a', ing: 'y', foo: 'G', value: 2 },
{ str: 'a', ing: 'y', foo: 'H', value: 3 },
{ str: 'a', ing: 'y', foo: 'K', value: 4 },
{ str: 'b', ing: 'y', foo: 'G', value: 300 },
{ str: 'b', ing: 'z', foo: 'H', value: 400 },
{ str: 'a', ing: 'z', foo: 'K', value: 5 },
{ str: 'a', ing: 'z', foo: 'G', value: 6 },
]
tidy(
data,
groupBy('str', [
summarize({ total: sum('value') })
])
)
// output:
[
{ str: 'a', total: 21 },
{ str: 'b', total: 1000 },
]
*/
tidy(
data,
groupBy(['str', 'ing'], [
summarize({ total: sum('value') })
])
)
// output:
[
{ str: 'a', ing: 'x', total: 1 },
{ str: 'a', ing: 'y', total: 9 },
{ str: 'a', ing: 'z', total: 11 },
{ str: 'b', ing: 'x', total: 300 },
{ str: 'b', ing: 'y', total: 300 },
{ str: 'b', ing: 'z', total: 400 },
]
*/

innerJoin#

Performs an inner join on two collections of items.

Parameters#

itemsToJoin#

object[] /* the join dataset */

The collection of items to join.

options#

{
by?:
| string /* key in both datasets */
| string[]
| {
[string /* key in join */]: string /* key in original */
}
}

An options object specifying with the following options:

  • by The key (string) or keys (string[]) to join the two collections on. This form only works if both sets of data have the same column names. If you need to map more specifically, provide an object mapping from key in the original data set to key in the join dataset. Note that if by is not provided, then overlapping columns will be autodetected and used.

Usage#

const data = [
{ a: 1, J: 'j', b: 10, c: 100 },
{ a: 1, J: 'k', b: 60, c: 600 },
{ a: 1, J: 'J', b: 30, c: 300 },
{ a: 2, J: 'j', b: 20, c: 200 },
{ a: 3, J: 'x', b: 50, c: 500 },
];
const joinData = [
{ a: 1, J: 'j', altJ: 'j', x: 'x1', y: 'y1' },
{ a: 1, J: 'J', altJ: 'J', x: 'x9', y: 'y9' },
{ a: 2, J: 'j', altJ: 'j', x: 'x2', y: 'y2' },
];
tidy(data, innerJoin(joinData, { by: ['a', 'J'] }));
// output:
[
{ a: 1, J: 'j', altJ: 'j', b: 10, c: 100, x: 'x1', y: 'y1' },
{ a: 1, J: 'J', altJ: 'J', b: 30, c: 300, x: 'x9', y: 'y9' },
{ a: 2, J: 'j', altJ: 'j', b: 20, c: 200, x: 'x2', y: 'y2' },
]
tidy(data, innerJoin(joinData, { by: { a: 'a', altJ: 'J' } }))
// output:
[
{ a: 1, J: 'j', altJ: 'j', b: 10, c: 100, x: 'x1', y: 'y1' },
{ a: 1, J: 'J', altJ: 'J', b: 30, c: 300, x: 'x9', y: 'y9' },
{ a: 2, J: 'j', altJ: 'j', b: 20, c: 200, x: 'x2', y: 'y2' },
]

leftJoin#

Performs a left join on two collections of items.

Parameters#

itemsToJoin#

object[] /* the join dataset */

The collection of items to join.

options#

{
by?:
| string /* key in both datasets */
| string[]
| {
[string /* key in join */]: string /* key in original */
}
}

An options object specifying with the following options:

  • by The key (string) or keys (string[]) to join the two collections on. This form only works if both sets of data have the same column names. If you need to map more specifically, provide an object mapping from key in the original data set to key in the join dataset. Note that if by is not provided, then overlapping columns will be autodetected and used.

Usage#

const data = [
{ a: 1, J: 'j', b: 10, c: 100 },
{ a: 1, J: 'k', b: 60, c: 600 },
{ a: 1, J: 'J', b: 30, c: 300 },
{ a: 2, J: 'j', b: 20, c: 200 },
{ a: 3, J: 'x', b: 50, c: 500 },
];
const joinData = [
{ a: 1, J: 'j', altJ: 'j', x: 'x1', y: 'y1' },
{ a: 1, J: 'J', altJ: 'J', x: 'x9', y: 'y9' },
{ a: 2, J: 'j', altJ: 'j', x: 'x2', y: 'y2' },
];
tidy(data, leftJoin(joinData, { by: ['a', 'J'] }));
// output:
[
{ a: 1, J: 'j', altJ: 'j', b: 10, c: 100, x: 'x1', y: 'y1' },
{ a: 1, J: 'k', b: 60, c: 600 },
{ a: 1, J: 'J', altJ: 'J', b: 30, c: 300, x: 'x9', y: 'y9' },
{ a: 2, J: 'j', altJ: 'j', b: 20, c: 200, x: 'x2', y: 'y2' },
{ a: 3, J: 'x', b: 50, c: 500 },
]
tidy(data, leftJoin(joinData, { by: { a: 'a', altJ: 'J' } }))
// output:
[
{ a: 1, J: 'j', altJ: 'j', b: 10, c: 100, x: 'x1', y: 'y1' },
{ a: 1, J: 'k', b: 60, c: 600 },
{ a: 1, J: 'J', altJ: 'J', b: 30, c: 300, x: 'x9', y: 'y9' },
{ a: 2, J: 'j', altJ: 'j', b: 20, c: 200, x: 'x2', y: 'y2' },
{ a: 3, J: 'x', b: 50, c: 500 },
]

map#

Maps items from one form to another, similar to Array.prototype.map.

Parameters#

mapFn#

(item: object, index: number, array: object[]) => object

Takes the current item and returns the new item.

Usage#

const data = [
{ value: 1, nested: { a: 10, b: 100 } },
{ value: 2, nested: { a: 20, b: 200 } },
];
tidy(data, map((d) => ({ value: d.value, ...d.nested }));
)
// output:
[
{ value: 1, a: 10, b: 100 },
{ value: 2, a: 20, b: 200 },
]

mutate#

Modify items by adding new columns/keys, or changing existing ones. This operation goes item by item, if you need to mutate with values across multiple items, use mutateWithSummary.

See item helpers for utility functions that help with common mutate operations.

Parameters#

mutateSpec#

{
[string /* (possibly new) key in mutated objects */]:
| (item: object) => any
| any
}

A specification showing how to modify values on the items.

If the mutate value is a function, it will be passed the an individual item at a time. All mutations specified happen on a single item before moving to the next item. For mutations that require computing across items, use mutateWithSummary.

mutate({ isEven: d => d.foo % 2 })
// items where foo is even have isEven true

If the mutate value is a single value, it will be assigned directly to each item.

mutate({ type: 'o' })
// all items have { type: 'o' }

If the mutate value is an array of values, it will be assigned directly to each item.

mutate({ type: ['o', 'a', 't'] })
// all items have { type: ['o', 'a', 't'] }

Note that the order of keys matters. Later keys can reference values from previous keys, e.g.:

mutate({
key: () => Math.random(),
key2: d => d.key * 5
}))

Usage#

const data = [
{ str: 'foo', value: 3 },
{ str: 'bar', value: 1 },
{ str: 'bar', value: 7 },
];
tidy(data, mutate({
x2: (d) => d.value * 2,
x4: (d) => d.x2 * 2,
constant: 99
}));
// output:
[
{ str: 'foo', value: 3, x2: 6, x4: 12, constant: 99 },
{ str: 'bar', value: 1, x2: 2, x4: 4, constant: 99 },
{ str: 'bar', value: 7, x2: 14, x4: 28, constant: 99 },
]

mutateWithSummary#

Modify items by adding new columns/keys, or changing existing ones. This operation can look across multiple items to produce values, which allows summarizations to be added (e.g. totals). If you only need to mutate individual items, use mutate.

See vector helpers and summarizers for utility functions that help with common mutateWithSummary operations.

Parameters#

mutateSummarySpec#

{
[string /* (possibly new) key in mutated objects */]:
| (items: object[]) => any | any[]
| any
| any[]
}

A specification showing how to modify values on the items.

For each key specified in the mutateSummarySpec, a vector of mutated values is computed then merged (immutably) back into the items before moving to the next key. If you want to mutate on a per-item basis, use mutate instead.

If the mutate value is a function, it will be passed the set of all items in the collection to run against, which allows efficient computing of things like means or totals across the entire dataset.

mutateWithSummary({
total: sum('value'),
rand: (items) => items.map(d => Math.random())
})
// all items have {
// total: <sum of value across all items>,
// rand: <different random number per item>
// }

If the mutate value is a single value, it will be assigned directly to each item.

mutateWithSummary({ type: 'o' })
// all items have { type: 'o' }

If the mutate value is an array of values, it will be assigned to the items using matching indices.

mutateWithSummary({ type: ['o', 'a', 't'] })
// items[0] has { type: 'o' },
// items[1] has { type: 'a' },
// items[2] has { type: 't' },

Note that the order of keys matters. Later keys can reference values from previous keys, e.g.:

mutateWithSummary({
total: sum('value'),
totalSquared: items => items[0].total * items[0].total
}))

Usage#

const data = [
{ str: 'foo', value: 3 },
{ str: 'bar', value: 1 },
{ str: 'bar', value: 7 },
];
tidy(data, mutateWithSummary({
total: sum('value'), // helper summary function 'sum'
}));
// output:
[
{ str: 'foo', value: 3, sum: 11 },
{ str: 'bar', value: 1, sum: 11 },
{ str: 'bar', value: 7, sum: 11 },
]

rename#

Rename keys in a collection.

Parameters#

renameSpec#

{ [oldKey: string]: string /* new key */ }

A mapping from the old key name to the new, renamed key, similar style to destructuring an object.

Usage#

const data = [
{ a: 1, b: 'b10', c: 100 },
{ a: 2, b: 'b20', c: 200 },
{ a: 3, b: 'b30', c: 300 },
];
tidy(data, rename({ b: 'newB', c: 'newC' }));
// output:
[
{ a: 1, newB: 'b10', newC: 100 },
{ a: 2, newB: 'b20', newC: 200 },
{ a: 3, newB: 'b30', newC: 300 },
]

replaceNully#

Replaces nully values with what is specified in the spec on a per-key basis.

Parameters#

replaceNullySpec#

object

A map from key name to the value that nully values should be replaced with for that key. For example, given an objects of the shape {a: number, b: string}, the replaceNullySpec may look like { a: -1, b: 'n/a' }. Note you are not required to fill in values for all keysโ€“ any unspecified keys will keep their nully value.

Usage#

const data = [
{ value: 1, foo: null, bar: '', x: 1 },
{ value: null, foo: undefined, bar: 'xx', x: 2 },
{ value: undefined, foo: 0, x: 3 },
];
tidy(data, replaceNully({ value: -1, foo: NaN, bar: 'N/A' }));
// output:
[
{ value: 1, foo: NaN, bar: '', x: 1 },
{ value: -1, foo: NaN, bar: 'xx', x: 2 },
{ value: -1, foo: 0, bar: 'N/A', x: 3 },
]

select / pick#

Select subparts of items. This function can be used to re-order keys or for selecting subselections of keys (similar to pick and omit from lodash).

See selectors for convenient ways specify keys to select.

Parameters#

selectKeys#

| string /* key of item */
| (items: T[]) => string[] /* keys of items */
| Array<string | (items: T[]) => string[]>

The keys, or functions that resolve to keys to select from the object. If a key is prefixed with -, it will be removed from the object. If the first argument passed begins with -, an implicit everything selector will be called first.

Usage#

const data = [
{ foo: 1, bar: 20, foobar: 300, FoObAR: 90, a: 'a1', b: 'b1' },
{ foo: 2, bar: 21, foobar: 301, FoObAR: 91, a: 'a2', b: 'b2' },
{ foo: 3, bar: 22, foobar: 302, FoObAR: 92, a: 'a3', b: 'b3' },
{ foo: 4, bar: 23, foobar: 303, FoObAR: 93, a: 'a4', b: 'b4' },
];
tidy(data, select(['a', startsWith('foo'), '-foobar']))
// output:
[
{ a: 'a1', foo: 1, FoObAR: 90 },
{ a: 'a2', foo: 2, FoObAR: 91 },
{ a: 'a3', foo: 3, FoObAR: 92 },
{ a: 'a4', foo: 4, FoObAR: 93 },
]

slice#

Selects a subset of the data, similar to Array.prototype.slice.

Parameters#

start#

number

The starting index to select from. The item at this index is included in the results.

end#

number

The ending index before which to end selecting. The item at this index is not included in the results.

Usage#

const data = [
{ value: 1 },
{ value: 2 },
{ value: 3 },
{ value: 4 },
{ value: 5 },
];
tidy(data, slice(1, 3))
// output:
[{ value: 2 }, { value: 3 }]

sliceHead#

Selects the first N items in the collection.

Parameters#

n#

number

The number of items to select.

Usage#

const data = [
{ value: 1 },
{ value: 2 },
{ value: 3 },
{ value: 4 },
{ value: 5 },
];
tidy(data, sliceHead(2))
// output:
[{ value: 1 }, { value: 2 }]

sliceTail#

Selects the last N items in the collection.

Parameters#

n#

number

The number of items to select.

Usage#

const data = [
{ value: 1 },
{ value: 2 },
{ value: 3 },
{ value: 4 },
{ value: 5 },
];
tidy(data, sliceTail(2))
// output:
[{ value: 4 }, { value: 5 }]

sliceMin#

Selects the minimum N items in the collection ordered by some comparators, similar to arrange.

Parameters#

n#

number

The number of items to select.

orderBy#

| string /* key of item */
| ((a: object, b: object) => number)
| Array<string | ((a: object, b: object) => number)>

A key or set of keys of the item to sort by, or comparator functions that return -1, 0, or 1 if a < b, a == b, a > b respectively. See arrange for details.

Usage#

const data = [
{ value: 3 },
{ value: 1 },
{ value: 4 },
{ value: 5 },
{ value: 2 },
];
tidy(data, sliceMin(2, 'value'))
// output:
[{ value: 1 }, { value: 2 }]

sliceMax#

Selects the maximum N items in the collection ordered by some comparators, similar to arrange.

Parameters#

n#

number

The number of items to select.

orderBy#

| string /* key of item */
| ((a: object, b: object) => number)
| Array<string | ((a: object, b: object) => number)>

A key or set of keys of the item to sort by, or comparator functions that return -1, 0, or 1 if a < b, a == b, a > b respectively. See arrange for details.

Usage#

const data = [
{ value: 3 },
{ value: 1 },
{ value: 4 },
{ value: 5 },
{ value: 2 },
];
tidy(data, sliceMax(2, 'value'))
// output:
[{ value: 5 }, { value: 4 }]

sliceSample#

Selects the a random sample of N items in the collection.

Parameters#

n#

number

The number of items to select.

options?#

{
replace?: boolean = false
}
  • replace = false: If true, samples items with replacement, otherwise without. If using with replacement, you can sample more than the items that are available.

Usage#

const data = [
{ value: 1 },
{ value: 2 },
{ value: 3 },
{ value: 4 },
{ value: 5 },
];
tidy(data, sliceSample(2))
// output:
[{ value: 4 }, { value: 1 }]
tidy(data, sliceSample(4, { replace: true }))
// output:
[{ value: 3 }, { value: 3 }, { value: 2 }, { value: 5 }]

summarize#

Takes a collection of items and reduces them to a single item, commonly used for computing averages or sums across a group or dataset.

Parameters#

summarizeSpec#

{
[string /* key in output */]: (items: object[]) => any
}

An object specifying how to compute the summarized values in the output. The output object matches the keys in this specification with their values set to the output of their respective functions in the spec. Typically the values make use of the provided summarizers, but can be anything.

For example:

tidy(
[{ value: 2 }, { value: 4 }],
summarize({
summed: sum('value'),
avg: mean('value')
}))
// output:
[{ summed: 6, avg: 3 }]

Note that keys not specified will be dropped from the output unless the rest option is provided.

options#

{
rest?: (key: string) => (items: object[]) => any
}
  • rest: When provided, all keys in the source objects that are not in the summarySpec will be resolved via the function that is provided. This is equivalent to specifying all of them in the summarySpec. Typically this is combined with first or last:
tidy(
[{ value: 2 }, { value: 4 }],
summarize(
{ summed: sum('value') },
{ rest: first }
))
// output:
[{ summed: 6, value: 2 }]

Usage#

const data = [
{ str: 'foo', value: 3 },
{ str: 'foo', value: 1 },
{ str: 'bar', value: 3 },
{ str: 'bar', value: 1 },
{ str: 'bar', value: 7 },
];
tidy(data, summarize({
summedValue: sum('value'),
secondValue: (items) => items[1].value
}))
// output:
[{ summedValue: 15, secondValue: 1 }]

summarizeAll#

A simpler form of summarize where all keys in the data are summarized via the specified function.

Parameters#

summaryFn#

(key: string) => (items: any[]) => any /* typically number */

The function to apply to each key in the source data to create the summarized output.

Usage#

const data = [
{ value2: 3, value: 3 },
{ value2: 4, value: 1 },
{ value2: 5, value: 3 },
{ value2: 1, value: 1 },
{ value2: 10, value: 7 },
];
tidy(data, summarizeAll(sum))
// output:
[{ value: 15, value2: 23 }]

summarizeAt#

A simpler form of summarize where the specified keys are summarized via the same specified function. All other keys are dropped.

Parameters#

keys#

Array<
| string /* keys in the object */
| (items: T[]) => string[]
>

The keys on which the summary function will be applied. You can either provide the keys directly as strings or you can use selectors, or a combination of the two.

summaryFn#

(key: string) => (items: any[]) => any /* typically number */

The function to apply to each key in the source data to create the summarized output.

Usage#

const data = [
{ str: 'foo1', value2: 3, value: 3 },
{ str: 'bar1', value2: 4, value: 1 },
{ str: 'baz1', value2: 5, value: 3 },
{ str: 'foo2', value2: 1, value: 1 },
{ str: 'bar2', value2: 10, value: 7 },
];
tidy(data, summarizeAt(['value', 'value2'], sum))
// output:
[{ value: 15, value2: 23 }]

summarizeIf#

A simpler form of summarize where the summary function is called on keys whose values pass the specified predicate.

Parameters#

predicateFn#

(vector: any[] /* array of single values */) => boolean

A function that given a vector of values for a key (e.g. items.map(item => item.value)), returns true if that key should be summarized.

summaryFn#

(key: string) => (items: any[]) => any /* typically number */

The function to apply to each key in the source data to create the summarized output.

Usage#

const data = [
{ str: 'foo1', value2: 3, value: 3 },
{ str: 'bar1', value2: 4, value: 1 },
{ str: 'baz1', value2: 5, value: 3 },
{ str: 'foo2', value2: 1, value: 1 },
{ str: 'bar2', value2: 10, value: 7 },
];
// if first value for a key is numeric, summarize that column
tidy(data, summarizeIf((vector) => Number.isFinite(vector[0]), sum)
// output:
[{ value: 15, value2: 23 }]

tally#

Tally is a wrapper that summarizes the data with n: counts the number of items (per group if grouped).

Parameters#

options#

{
name: string = 'n',
}
  • name = 'n': The name of the count value in the resulting items.

Usage#

const data = [
{ a: 1, b: 10, c: 100 },
{ a: 2, b: 20, c: 200 },
{ a: 3, b: 30, c: 300 },
];
tidy(data, tally());
// [{ n: 3 }]

total#

Convenience wrapper around summarize and mutate that appends a new summarized row to the end of the data. Typically used for computing totals.

Parameters#

summarizeSpec#

{
[string /* key in output */]: (items: object[]) => any
}

The same as summarize::summarizeSpec โ€“ an object specifying how to compute the summarized values in the output. In this case, the expectation is the keys match the input data.

mutateSpec#

{
[string /* (possibly new) key in mutated objects */]:
| (item: object) => any
| any
}

A specification showing how to modify values on the items. See mutate for details. Can be useful for setting a field that indicates this is a total row (e.g. { id: '__total' }).

Usage#

const data = [
{ str: 'foo', value: 3 },
{ str: 'foo', value: 1 },
{ str: 'bar', value: 3 },
{ str: 'bar', value: 1 },
{ str: 'bar', value: 7 },
];
tidy(data, total(
{ value: sum('value') },
{ str: 'total' }
))
// output:
[
{ str: 'foo', value: 3 },
{ str: 'foo', value: 1 },
{ str: 'bar', value: 3 },
{ str: 'bar', value: 1 },
{ str: 'bar', value: 7 },
{ str: 'total', value: 15 },
]

totalAll#

A simpler form of total, but uses summarizeAll instead of summarize.

Parameters#

summaryFn#

(key: string) => (items: any[]) => any /* typically number */

The function to apply to each key in the source data to create the summarized output.

mutateSpec#

{
[string /* (possibly new) key in mutated objects */]:
| (item: object) => any
| any
}

A specification showing how to modify values on the items. See mutate for details. Can be useful for setting a field that indicates this is a total row (e.g. { id: '__total' }).

Usage#

const data = [
{ value2: 3, value: 3 },
{ value2: 4, value: 1 },
{ value2: 5, value: 3 },
{ value2: 1, value: 1 },
{ value2: 10, value: 7 },
];
tidy(data, totalAll(
{ value: sum('value') },
{ str: 'total' }
))
// output:
[
{ value2: 3, value: 3 },
{ value2: 4, value: 1 },
{ value2: 5, value: 3 },
{ value2: 1, value: 1 },
{ value2: 10, value: 7 },
{ value: 15, value2: 23, str: 'total' },
]

totalAt#

A simpler form of total, but uses summarizeAt instead of summarize.

Parameters#

keys#

Array<
| string /* keys in the object */
| (items: T[]) => string[]
>

The keys on which the summary function will be applied. You can either provide the keys directly as strings or you can use selectors, or a combination of the two.

summaryFn#

(key: string) => (items: any[]) => any /* typically number */

The function to apply to each key in the source data to create the summarized output.

mutateSpec#

{
[string /* (possibly new) key in mutated objects */]:
| (item: object) => any
| any
}

A specification showing how to modify values on the items. See mutate for details. Can be useful for setting a field that indicates this is a total row (e.g. { id: '__total' }).

Usage#

const data = [
{ str: 'foo1', value2: 3, value: 3 },
{ str: 'bar1', value2: 4, value: 1 },
{ str: 'baz1', value2: 5, value: 3 },
{ str: 'foo2', value2: 1, value: 1 },
{ str: 'bar2', value2: 10, value: 7 },
];
tidy(data, totalAt(
['value', 'value2'],
sum,
{ str: 'total' }
))
// output:
[
{ str: 'foo1', value2: 3, value: 3 },
{ str: 'bar1', value2: 4, value: 1 },
{ str: 'baz1', value2: 5, value: 3 },
{ str: 'foo2', value2: 1, value: 1 },
{ str: 'bar2', value2: 10, value: 7 },
{ str: 'total', value: 15, value2: 23 },
]

totalIf#

A simpler form of total, but uses summarizeIf instead of summarize.

Parameters#

predicateFn#

(vector: any[] /* array of single values */) => boolean

A function that given a vector of values for a key (e.g. items.map(item => item.value)), returns true if that key should be summarized.

summaryFn#

(key: string) => (items: any[]) => any /* typically number */

The function to apply to each key in the source data to create the summarized output.

mutateSpec#

{
[string /* (possibly new) key in mutated objects */]:
| (item: object) => any
| any
}

A specification showing how to modify values on the items. See mutate for details. Can be useful for setting a field that indicates this is a total row (e.g. { id: '__total' }).

Usage#

const data = [
{ str: 'foo1', value2: 3, value: 3 },
{ str: 'bar1', value2: 4, value: 1 },
{ str: 'baz1', value2: 5, value: 3 },
{ str: 'foo2', value2: 1, value: 1 },
{ str: 'bar2', value2: 10, value: 7 },
];
tidy(data, totalIf(
(vector) => Number.isFinite(vector[0]),
sum,
{ str: 'total' }
))
// output:
[
{ str: 'foo1', value2: 3, value: 3 },
{ str: 'bar1', value2: 4, value: 1 },
{ str: 'baz1', value2: 5, value: 3 },
{ str: 'foo2', value2: 1, value: 1 },
{ str: 'bar2', value2: 10, value: 7 },
{ str: 'total', value: 15, value2: 23 },
]

transmute#

The same as mutate, except all keys are dropped except those specified to be mutated.

Parameters#

mutateSpec#

{
[string /* (possibly new) key in mutated objects */]:
| (item: object) => any
| any
}

A specification showing how to modify values on the items. See mutate for details.

Usage#

const data = [
{ str: 'foo', value: 3 },
{ str: 'bar', value: 1 },
{ str: 'bar', value: 7 },
];
tidy(data, transmute({
value_x2: (d) => d.value * 2,
value_x4: (d) => d.value_x2 * 2,
}));
// output:
[
{ value_x2: 6, value_x4: 12 },
{ value_x2: 2, value_x4: 4 },
{ value_x2: 14, value_x4: 28 },
]

when#

Conditionally runs a tidy subflow based on the result of a boolean or predicate function.

Parameters#

predicate#

| boolean
| (items: object[]) => boolean

When true, or the function results in true, the subflow is run, otherwise the input items are passed through unmodified.

fns#

Array<(items: object[]) => object[]>

Array of tidy functions to run on the input data when the predicate is true.

Usage#

const data = [{ x: 1 }, { x: 2 }, { x: 3 }];
tidy(data,
when(true, [
mutate({ y: 52 })
])
);
// output:
[{ x: 1, y: 52 }, { x: 2, y: 52 }, { x: 3, y: 52 }]
tidy(data,
when((items) => items.length === 2, [
mutate({ y: 52 })
])
);
// output:
[{ x: 1 }, { x: 2 }, { x: 3 }]

Last updated on by Peter Beshai