# fast-redact

very fast object redaction

[![Build Status](https://travis-ci.org/davidmarkclements/fast-redact.svg?branch=master)](https://travis-ci.org/davidmarkclements/fast-redact)

## Default Usage

By default, `fast-redact` serializes an object with `JSON.stringify`, censoring any
data at paths specified:

```js
const fastRedact = require('fast-redact')
const fauxRequest = {
  headers: {
    host: 'http://example.com',
    cookie: `oh oh we don't want this exposed in logs in etc.`,
    referer: `if we're cool maybe we'll even redact this`
  }
}
const redact = fastRedact({
  paths: ['headers.cookie', 'headers.referer']
})

console.log(redact(fauxRequest))
// {"headers":{"host":"http://example.com","cookie":"[REDACTED]","referer":"[REDACTED]"}}
```

## API

### `require('fast-redact')({paths, censor, serialize, remove, strict}) => Function`

When called without any options, or with a zero-length `paths` array,
`fast-redact` will return `JSON.stringify` or the `serialize` option, if set.

#### `paths` – `Array`

An array of strings describing the nested location of a key in an object.

The syntax follows that of the ECMAScript specification: any JavaScript
path is accepted, and both bracket and dot notation are supported. For instance, in
each of the following cases the `c` property will be redacted: `a.b.c`, `a['b'].c`,
`a["b"].c`, ``a[`b`].c``. Since bracket notation is supported, array indices are also
supported: `a[0].b` would redact the `b` key in the first object of the `a` array.

Leading brackets are also allowed; for instance, `["a"].b.c` will work.

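
To make the notation concrete, the sketch below splits a path string into its key segments, showing that the dot and bracket forms name the same keys. `parsePath` is a hypothetical helper written for this illustration; it is not part of the `fast-redact` API and makes no claim about how the library parses or validates paths internally.

```js
// Hypothetical helper, for illustration only (not part of fast-redact).
// Splits a path in dot and/or bracket notation into its key segments.
function parsePath (path) {
  // Alternatives, in order: ['single'], ["double"], [`backtick`],
  // [0] or [*], and a bare segment between dots.
  const rx = /\['([^']*)'\]|\["([^"]*)"\]|\[`([^`]*)`\]|\[(\d+|\*)\]|([^.[\]]+)/g
  const segments = []
  let match
  while ((match = rx.exec(path)) !== null) {
    // Exactly one capture group is defined per match; take that one.
    segments.push(match.slice(1).find((group) => group !== undefined))
  }
  return segments
}

console.log(parsePath('a.b.c'))     // [ 'a', 'b', 'c' ]
console.log(parsePath("a['b'].c"))  // [ 'a', 'b', 'c' ]
console.log(parsePath('["a"].b.c')) // [ 'a', 'b', 'c' ]
console.log(parsePath('a[0].b'))    // [ 'a', '0', 'b' ]
```

All four quoting styles listed above resolve to the same segment list, which is why they redact the same property.
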
##### Wildcards

In addition to static paths, asterisk wildcards are also supported.

When an asterisk is placed in the final position, it will redact all keys within the
parent object. For instance, `a.b.*` will redact all keys in the `b` object. Similarly,
for arrays, `a.b[*]` will redact all elements of the `b` array. In practice it doesn't matter
whether `b` is an object or an array: both notation styles work in either case.

When an asterisk is in an intermediate or first position, the paths following the asterisk will
be redacted for every object within the parent.

For example:

```js
const fastRedact = require('fast-redact')
const redact = fastRedact({paths: ['*.c.d']})
const obj = {
  x: {c: {d: 'hide me', e: 'leave me be'}},
  y: {c: {d: 'and me', f: 'I want to live'}},
  z: {c: {d: 'and also I', g: 'I want to run in a stream'}}
}
console.log(redact(obj))
// {"x":{"c":{"d":"[REDACTED]","e":"leave me be"}},"y":{"c":{"d":"[REDACTED]","f":"I want to live"}},"z":{"c":{"d":"[REDACTED]","g":"I want to run in a stream"}}}
```

Another example with a nested array:

```js
const fastRedact = require('fast-redact')
const redact = fastRedact({paths: ['a[*].c.d']})
const obj = {
  a: [
    {c: {d: 'hide me', e: 'leave me be'}},
    {c: {d: 'and me', f: 'I want to live'}},
    {c: {d: 'and also I', g: 'I want to run in a stream'}}
  ]
}
console.log(redact(obj))
// {"a":[{"c":{"d":"[REDACTED]","e":"leave me be"}},{"c":{"d":"[REDACTED]","f":"I want to live"}},{"c":{"d":"[REDACTED]","g":"I want to run in a stream"}}]}
```

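
The first-position wildcard can be pictured as a loop over every child of the parent, applying the rest of the path to each one. The sketch below is a simplified, hypothetical stand-in (`redactUnderWildcard` is not a `fast-redact` function): it handles only a leading `*` followed by static keys, mutates the object in place, and skips children where the remaining path does not exist.

```js
// Hypothetical, simplified stand-in for a leading-wildcard path such as
// '*.c.d' (not fast-redact's implementation). Mutates `obj` in place.
function redactUnderWildcard (obj, restOfPath, censor = '[REDACTED]') {
  for (const key of Object.keys(obj)) {
    let target = obj[key]
    // Walk down to the parent of the final key, stopping if the path is absent.
    for (let i = 0; i < restOfPath.length - 1; i++) {
      if (target === null || typeof target !== 'object') break
      target = target[restOfPath[i]]
    }
    const last = restOfPath[restOfPath.length - 1]
    if (target !== null && typeof target === 'object' && last in target) {
      target[last] = censor
    }
  }
  return obj
}

const sample = {
  x: { c: { d: 'hide me', e: 'leave me be' } },
  y: { c: { f: 'no d key here' } },
  z: 'not an object'
}
redactUnderWildcard(sample, ['c', 'd'])
console.log(JSON.stringify(sample))
// {"x":{"c":{"d":"[REDACTED]","e":"leave me be"}},"y":{"c":{"f":"no d key here"}},"z":"not an object"}
```

Because `Object.keys` also returns the indices of an array, the same loop covers the `a[*].c.d` form when it is given the `a` array as `obj`.
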
#### `remove` – `Boolean` – `(false)`

The `remove` option, when set to `true`, will cause keys to be removed from the
serialized output.

Since the implementation exploits the fact that keys with `undefined` values are ignored
by `JSON.stringify`, the `remove` option may *only* be used when `JSON.stringify`
is the serializer (this is the default); otherwise `fast-redact` will throw.

If supplying a custom serializer that has the same behavior (removing keys
with `undefined` values), this restriction can be bypassed by explicitly setting
the `censor` to `undefined`.

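
The `JSON.stringify` behavior that `remove` relies on is standard JavaScript and can be checked without the library. The snippet below uses made-up sample data and plain `JSON.stringify` only:

```js
// Plain JavaScript, no fast-redact involved: JSON.stringify omits object
// properties whose value is undefined.
const record = { user: 'alice', token: undefined }
console.log(JSON.stringify(record)) // {"user":"alice"}

// Inside arrays, undefined is serialized as null instead of being dropped.
console.log(JSON.stringify([1, undefined, 3])) // [1,null,3]
```

A custom serializer that does not drop `undefined` values in this way would leave the key visible, which is why the restriction above exists.
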
#### `censor` – `<Any type>` – `('[REDACTED]')`

This is the value which overwrites redacted properties.

Setting `censor` to `undefined` will cause properties to be removed as long as this is
the behavior of the `serialize` function. The default, `JSON.stringify`, does
remove `undefined` properties.

Setting `censor` to a function will cause `fast-redact` to invoke it with the original
value. The output of the `censor` function sets the redacted value.
Please note that asynchronous functions are not supported.

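
As an illustration of a function censor, the sketch below keeps the last four characters of a value and masks the rest. `maskAllButLastFour` is a hypothetical name and the masking rule is only an example; the function is exercised on its own here, and the final comment shows how it would be supplied as the `censor` option according to the description above.

```js
// Hypothetical censor function: receives the original value and returns
// the value that should replace it. It must be synchronous.
function maskAllButLastFour (value) {
  const text = String(value)
  if (text.length <= 4) return '[REDACTED]'
  return '*'.repeat(text.length - 4) + text.slice(-4)
}

console.log(maskAllButLastFour('4111111111111111')) // ************1111
console.log(maskAllButLastFour('abc'))              // [REDACTED]

// Intended use (an assumption based on the option description above):
// fastRedact({ paths: ['card.number'], censor: maskAllButLastFour })
```
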
#### `serialize` – `Function | Boolean` – `(JSON.stringify)`

The `serialize` option may be a function or a boolean. If a function is supplied, it
will be used to serialize the redacted object. It's important to understand that, for
performance reasons, `fast-redact` *mutates* the original object, then serializes, then
restores the original values. So the object passed to the serializer is the exact same
object passed to the redacting function.

The `serialize` option as a function example:

```js
const fastRedact = require('fast-redact')
const redact = fastRedact({
  paths: ['a'],
  serialize: (o) => JSON.stringify(o, null, 2)
})
console.log(redact({a: 1, b: 2}))
// {
//   "a": "[REDACTED]",
//   "b": 2
// }
```

For advanced usage the `serialize` option can be set to `false`. When `serialize` is set to `false`,
instead of the serialized object, the output of the redactor function will be the mutated object
itself (this is the exact same object that was passed in). In addition, a `restore` method is supplied
on the redactor function, allowing the redacted keys to be restored with the original data.

```js
const fastRedact = require('fast-redact')
const redact = fastRedact({
  paths: ['a'],
  serialize: false
})
const o = {a: 1, b: 2}
console.log(redact(o) === o) // true
console.log(o) // { a: '[REDACTED]', b: 2 }
console.log(redact.restore(o) === o) // true
console.log(o) // { a: 1, b: 2 }
```

#### `strict` – `Boolean` – `(true)`

The `strict` option, when set to `true`, will cause the redactor function to throw if it finds
a primitive instead of an object. When `strict` is set to `false`, the redactor function
will treat the primitive value as having already been redacted, and return it serialized (with
`JSON.stringify` or the user's custom `serialize` function), or as-is if the `serialize` option
was set to `false`.

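
The decision described above can be sketched as a small function. This is a hypothetical illustration of the documented behavior, not the library's source: `handlePrimitive` and the error message are invented for the example.

```js
// Hypothetical sketch of the documented `strict` behavior for primitive
// input; not fast-redact's source code. The error text is made up.
function handlePrimitive (value, { strict = true, serialize = JSON.stringify } = {}) {
  if (strict) {
    throw new Error('primitives cannot be redacted')
  }
  // Non-strict: treat the value as already redacted.
  return serialize === false ? value : serialize(value)
}

console.log(handlePrimitive('plain string', { strict: false }))       // "plain string"
console.log(handlePrimitive(42, { strict: false, serialize: false })) // 42

try {
  handlePrimitive(42) // strict defaults to true
} catch (err) {
  console.log(err.message) // primitives cannot be redacted
}
```
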
## Approach

In order to achieve lowest cost/highest performance redaction, `fast-redact`
creates and compiles a function (using the `Function` constructor) on initialization.
It's important to distinguish this from the dangers of a runtime eval: no user input
is involved in creating the string that compiles into the function. This is as safe
as writing code normally and having it compiled by V8 in the usual way.

Thanks to changes in V8 in recent years, state can be injected into compiled functions
using `bind` at very low cost (whereas `bind` used to be expensive, and getting state
into a compiled function by any means was difficult without a performance penalty).

For static paths, this function simply checks that the path exists and then overwrites
with the censor. Wildcard paths are processed with normal functions that iterate over
the object, redacting values as necessary.

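
A minimal sketch of the general technique, assuming nothing about `fast-redact`'s internals: a function body is assembled as a string from a fixed, developer-supplied path, compiled once with the `Function` constructor, and given its state through `bind`. `compileStaticRedactor` and the shape of the generated code are hypothetical, and the sketch supports only a single dot-separated static path.

```js
// Hypothetical illustration of compile-once redaction for one static,
// dot-separated path. Not fast-redact's generated code.
function compileStaticRedactor (path, censor) {
  const keys = path.split('.')
  const last = JSON.stringify(keys.pop())
  // Accessor expressions for the root and each intermediate object,
  // e.g. o and o["headers"] for the path 'headers.cookie'.
  const prefixes = ['o']
  for (const key of keys) {
    prefixes.push(`${prefixes[prefixes.length - 1]}[${JSON.stringify(key)}]`)
  }
  const parent = prefixes[prefixes.length - 1]
  // Existence check: every object along the path must be a non-null object.
  const guard = prefixes
    .map((p) => `typeof ${p} === 'object' && ${p} !== null`)
    .join(' && ')
  const body = `
    if (${guard} && ${last} in ${parent}) {
      ${parent}[${last}] = this.censor
    }
    return o
  `
  // Compile once; `bind` injects the state the generated code reads via `this`.
  return new Function('o', body).bind({ censor })
}

const redactCookie = compileStaticRedactor('headers.cookie', '[REDACTED]')
console.log(JSON.stringify(redactCookie({ headers: { cookie: 'secret', host: 'example.com' } })))
// {"headers":{"cookie":"[REDACTED]","host":"example.com"}}
console.log(JSON.stringify(redactCookie({ body: 'no headers here' })))
// {"body":"no headers here"}
```
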
It's important to note that the original object is mutated; for performance reasons
a copy is not made. See [rfdc](https://github.com/davidmarkclements/rfdc) (Really Fast
Deep Clone) for the fastest known way to clone. Even so, cloning is not nearly as fast
as editing the original object, serializing, and then restoring values.

A `restore` function is also created and compiled to put the original state back on
to the object after redaction. This means that in the default usage case, the operation
is essentially atomic: the object is mutated, serialized, and restored internally, which
avoids any state management issues.

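
The mutate, serialize, restore sequence can be sketched in a few lines. This is a hypothetical illustration limited to a single top-level key, not the library's implementation; `redactAndSerialize` is an invented name, and the use of `try`/`finally` is a choice made for this sketch.

```js
// Hypothetical sketch of mutate -> serialize -> restore for a single
// top-level key; not fast-redact's implementation.
function redactAndSerialize (obj, key, censor = '[REDACTED]') {
  if (!(key in obj)) return JSON.stringify(obj)
  const original = obj[key] // remember the real value
  obj[key] = censor         // mutate in place (no clone)
  try {
    return JSON.stringify(obj)
  } finally {
    obj[key] = original     // always put the real value back
  }
}

const request = { cookie: 'session=abc123', path: '/login' }
console.log(redactAndSerialize(request, 'cookie'))
// {"cookie":"[REDACTED]","path":"/login"}
console.log(request.cookie)
// session=abc123
```

From the caller's point of view the object is unchanged after the call, which is the sense in which the default usage is described as essentially atomic.
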
## Caveat

As mentioned in Approach, the `paths` array input is dynamically compiled into a function
at initialization time. While the `paths` array is vigorously tested for any developer
errors, allowing user input to directly supply any paths to redact is strongly
discouraged. It can't be guaranteed that allowing user input for `paths` couldn't
feasibly expose an attack vector.

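
One way to follow this advice, sketched under the assumption that an application wants to let users choose which fields to hide: keep every redactable path in a fixed list written by the developer and let user input only select from that list. `ALLOWED_PATHS` and `selectPaths` are hypothetical names, and the paths are sample values.

```js
// Hypothetical allowlist: every path that may ever reach the `paths`
// option is written here by the developer, never by a user.
const ALLOWED_PATHS = new Set(['headers.cookie', 'headers.referer', 'body.password'])

// Keep only the requested paths that appear verbatim in the allowlist.
function selectPaths (requested) {
  return requested.filter((path) => ALLOWED_PATHS.has(path))
}

const userChoice = ['headers.cookie', 'a"]; process.exit(); //']
console.log(selectPaths(userChoice)) // [ 'headers.cookie' ]
```

Only strings from the developer's own list are ever compiled, so the user-supplied text is used for lookup and nothing else.
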
## Benchmarks

The fastest known predecessor to `fast-redact` is the non-generic [`pino-noir`](http://npm.im/pino-noir)
library (which I also wrote).

In the direct calling case, `fast-redact` is ~30x faster than `pino-noir`; however, a more realistic
comparison is overhead on `JSON.stringify`.

For a static redaction case (no wildcards), `pino-noir` adds ~25% overhead on top of `JSON.stringify`,
whereas `fast-redact` adds ~1% overhead.

In the basic last-position wildcard case, `fast-redact` is ~12% faster than `pino-noir`.

The `pino-noir` module does not support intermediate wildcards, but `fast-redact` does.
The cost of an intermediate wildcard that results in two keys over two nested objects
being redacted is about 25% overhead on `JSON.stringify`. The cost of an intermediate
wildcard that results in four keys across two objects being redacted is about 55% overhead
on `JSON.stringify` and ~50% more expensive than explicitly declaring the keys.

```sh
npm run bench
```

```
benchNoirV2*500: 59.108ms
benchFastRedact*500: 2.483ms
benchFastRedactRestore*500: 10.904ms
benchNoirV2Wild*500: 91.399ms
benchFastRedactWild*500: 21.200ms
benchFastRedactWildRestore*500: 27.304ms
benchFastRedactIntermediateWild*500: 92.304ms
benchFastRedactIntermediateWildRestore*500: 107.047ms
benchJSONStringify*500: 210.573ms
benchNoirV2Serialize*500: 281.148ms
benchFastRedactSerialize*500: 215.845ms
benchNoirV2WildSerialize*500: 281.168ms
benchFastRedactWildSerialize*500: 247.140ms
benchFastRedactIntermediateWildSerialize*500: 333.722ms
benchFastRedactIntermediateWildMatchWildOutcomeSerialize*500: 463.667ms
benchFastRedactStaticMatchWildOutcomeSerialize*500: 239.293ms
```

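
For readers who want to turn a benchmark run into an overhead figure, the calculation is the relative difference against the `JSON.stringify` baseline. The helper name `overheadPercent` is made up for this sketch, and the inputs are copied from the sample output above. Timings differ between runs and machines, so the results need not match the approximate percentages quoted in the prose.

```js
// Relative overhead of a candidate over a baseline, as a percentage.
function overheadPercent (baselineMs, candidateMs) {
  return ((candidateMs - baselineMs) / baselineMs) * 100
}

// Timings copied from the sample output above.
const jsonStringify = 210.573       // benchJSONStringify
const fastRedactSerialize = 215.845 // benchFastRedactSerialize
const noirSerialize = 281.148       // benchNoirV2Serialize

console.log(overheadPercent(jsonStringify, fastRedactSerialize).toFixed(1)) // 2.5
console.log(overheadPercent(jsonStringify, noirSerialize).toFixed(1))       // 33.5
```
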
## Tests

```sh
npm test
```

```
  224 passing (499.544ms)
```

### Coverage

```sh
npm run cov
```

```
-----------------|----------|----------|----------|----------|-------------------|
File             |  % Stmts | % Branch |  % Funcs |  % Lines | Uncovered Line #s |
-----------------|----------|----------|----------|----------|-------------------|
All files        |      100 |      100 |      100 |      100 |                   |
 fast-redact     |      100 |      100 |      100 |      100 |                   |
  index.js       |      100 |      100 |      100 |      100 |                   |
 fast-redact/lib |      100 |      100 |      100 |      100 |                   |
  modifiers.js   |      100 |      100 |      100 |      100 |                   |
  parse.js       |      100 |      100 |      100 |      100 |                   |
  redactor.js    |      100 |      100 |      100 |      100 |                   |
  restorer.js    |      100 |      100 |      100 |      100 |                   |
  rx.js          |      100 |      100 |      100 |      100 |                   |
  state.js       |      100 |      100 |      100 |      100 |                   |
  validator.js   |      100 |      100 |      100 |      100 |                   |
-----------------|----------|----------|----------|----------|-------------------|
```

## License

MIT

## Acknowledgements

Sponsored by [nearForm](http://www.nearform.com)