output: ['[5,3,7]']
- program: '.[] | select(.id == "second")'
input: '[{"id": "first", "val": 1}, {"id": "second", "val": 2}]'
- output: '{"id": "second", "val": 2}'
+ output: ['{"id": "second", "val": 2}']
- title: "`arrays`, `objects`, `iterables`, `booleans`, `numbers`, `strings`, `nulls`, `values`, `scalars`"
jq uses the Oniguruma regular expression library, as do PHP,
Ruby, TextMate, Sublime Text, etc., so the description here
- will focus on jq specifics.
+ will focus on jq specifics.
The jq regex filters are defined so that they can be invoked using
one of these patterns:
entries:
- title: "`test(val)`, `test(regex; flags)`"
body: |
-
+
Like `match`, but does not return match objects, only `true` or `false`
for whether or not the regex matches the input.
-
+
examples:
- program: 'test("foo")'
input: '"foo"'
input: '"xabcd" "ABC"'
output: true
true
-
+
- title: "`match(val)`, `match(regex; flags)`"
body: |
-
+
**match** outputs an object for each match it finds. Matches have
the following fields:
-
+
* `offset` - offset in UTF-8 codepoints from the beginning of the input
* `length` - length in UTF-8 codepoints of the match
* `string` - the string that it matched
* `captures` - an array of objects representing capturing groups.
-
+
Capturing group objects have the following fields:
-
+
* `offset` - offset in UTF-8 codepoints from the beginning of the input
* `length` - length in UTF-8 codepoints of this capturing group
* `string` - the string that was captured
* `name` - the name of the capturing group (or `null` if it was unnamed)
-
+
Capturing groups that did not match anything return an offset of -1.
-
+
examples:
- program: 'match("(abc)+"; "g")'
input: '"abc abc"'
- output:
+ output:
- '{"offset": 0, "length": 3, "string": "abc", "captures": [{"offset": 0, "length": 3, "string": "abc", "name": null}]}'
- '{"offset": 4, "length": 3, "string": "abc", "captures": [{"offset": 4, "length": 3, "string": "abc", "name": null}]}'
- program: 'match("foo")'
output: ['{"offset": 0, "length": 3, "string": "foo", "captures": []}']
- program: 'match(["foo", "ig"])'
input: '"foo bar FOO"'
- output:
+ output:
- '{"offset": 0, "length": 3, "string": "foo", "captures": []}'
- '{"offset": 8, "length": 3, "string": "FOO", "captures": []}'
- program: 'match("foo (?<bar123>bar)? foo"; "ig")'
output:
- '{"offset": 0, "length": 11, "string": "foo bar foo", "captures": [{"offset": 4, "length": 3, "string": "bar", "name": "bar123"}]}'
- '{"offset": 12, "length": 8, "string": "foo foo", "captures": [{"offset": -1, "length": 0, "string": null, "name": "bar123"}]}'
-
+
- program: '[ match("."; "g")] | length'
input: '"abc"'
output: 3
-
-
+
+
- title: "`capture(val)`, `capture(regex; flags)`"
body: |
-
+
Collects the named captures in a JSON object, with the name
of each capture as the key, and the matched string as the
corresponding value.
-
+
examples:
- program: 'capture("(?<a>[a-z]+)-(?<n>[0-9]+)")'
input: '"xyzzy-14"'
- output: '{ "a": "xyzzy", "n": "14" }''
+ output: ['{ "a": "xyzzy", "n": "14" }']
- title: "`scan(regex)`, `scan(regex; flags)`"
- body: |
-
- Emit a stream of the non-overlapping substrings of the input
- that match the regex in accordance with the flags, if any
- have been specified. If there is no match, the stream is empty.
- To capture all the matches for each input string, use the idiom
- [ expr ], e.g. [ scan(regex) ].
-
- example:
- - program: 'scan("c")'
- input: '"abcdefabc"'
- output: '"c"'
- '"c"'
-
- - program: 'scan("b")'
- input: ("", "")
- output: '[]'
- '[]"'
-
- - title: "`split(regex)`, split(regex; flags)`"
- body: |
-
- For backwards compatibility, `split` emits an array of the strings
- corresponding to the successive segments of the input string after it
- has been split at the boundaries defined by the regex and any
- specified flags. The substrings corresponding to the boundaries
- themselves are excluded. If regex is the empty string, then the first
- match will be the empty string.
-
- `split(regex)` can be thought of as a wrapper around `splits(regex)`,
- and similarly for `split(regex; flags)`.
-
- example:
- - program: 'split(", *")'
- input: '"ab,cd, ef"`
- output: '["ab","cd","ef"]'
-
-
+ body: |
+
+ Emit a stream of the non-overlapping substrings of the input
+ that match the regex in accordance with the flags, if any
+ have been specified. If there is no match, the stream is empty.
+ To capture all the matches for each input string, use the idiom
+ `[ expr ]`, e.g. `[ scan(regex) ]`.
+
+ examples:
+ - program: 'scan("c")'
+ input: '"abcdefabc"'
+ output: ['"c"', '"c"']
+
+ - program: 'scan("b")'
+ input: ("", "")
+ output: ['[]', '[]']
+
+ - title: "`split(regex)`, `split(regex; flags)`"
+ body: |
+
+ For backwards compatibility, `split` emits an array of the strings
+ corresponding to the successive segments of the input string after it
+ has been split at the boundaries defined by the regex and any
+ specified flags. The substrings corresponding to the boundaries
+ themselves are excluded. If regex is the empty string, then the first
+ match will be the empty string.
+
+ `split(regex)` can be thought of as a wrapper around `splits(regex)`,
+ and similarly for `split(regex; flags)`.
+
+ examples:
+ - program: 'split(", *")'
+ input: '"ab,cd, ef"'
+ output: ['["ab","cd","ef"]']
+
+
- title: "`splits(regex)`, `splits(regex; flags)`"
- body: |
-
- These provide the same results as their `split` counterparts,
- but as a stream instead of an array.
-
- example:
- - program: 'splits(", *")'
- input: '("ab,cd", "ef, gh")`
- output:
- '"ab"'
- '"cd"'
- '"ef"'
- '"gh"'
-
+ body: |
+
+ These provide the same results as their `split` counterparts,
+ but as a stream instead of an array.
+
+ examples:
+ - program: 'splits(", *")'
+ input: '("ab,cd", "ef, gh")'
+ output: ['"ab"', '"cd"', '"ef"', '"gh"']
+
- title: "`sub(regex; tostring)`"
-
- body: |
-
- Emit the string obtained by replacing the first match of regex in the
- input string with `tostring`, after interpolation. `tostring` should
- be a jq string, and may contain references to named captures. The
- named captures are, in effect, presented as a JSON object (as
- constructed by `capture`) to `tostring`, so a reference to a captured
- variable named "x" would take the form: "\(.x)".
-
- example:
- - program: 'sub("^[^a-z]*(?<x>[a-z]*).*")'
- input: '"123abc456"'
- output: '"ZabcZabc"'
-
-
+ body: |
+
+ Emit the string obtained by replacing the first match of regex in the
+ input string with `tostring`, after interpolation. `tostring` should
+ be a jq string, and may contain references to named captures. The
+ named captures are, in effect, presented as a JSON object (as
+ constructed by `capture`) to `tostring`, so a reference to a captured
+ variable named "x" would take the form: "\(.x)".
+
+ examples:
+ - program: 'sub("^[^a-z]*(?<x>[a-z]*).*"; "Z\(.x)Z\(.x)")'
+ input: '"123abc456"'
+ output: ['"ZabcZabc"']
+
+
- title: "`gsub(regex; string)`"
-
- body: |
-
- `gsub` is like `sub` but all the non-overlapping occurrences of the regex are
- replaced by the string, after interpolation.
-
- example:
- - program: 'gsub("(?<x>.)[^a]*"; "+\(.x)-")'
-
- input: '"Abcabc"'
- output: '"+A-+a-"'
-
-
+ body: |
+
+ `gsub` is like `sub` but all the non-overlapping occurrences of the regex are
+ replaced by the string, after interpolation.
+
+ examples:
+ - program: 'gsub("(?<x>.)[^a]*"; "+\(.x)-")'
+ input: '"Abcabc"'
+ output: ['"+A-+a-"']
+
+
- title: Advanced features
body: |
Variables are an absolute necessity in most programming languages, but