I think the big problem is that it's a tool you reach for so rarely that you never quite get the opportunity to really learn it well, so it always remains in that valley of despair where you know you should use it, but it's never intuitive or easy to use.
It's not unique in that regard. 'sed' is Turing complete[1][2], but few people get farther than learning how to do a basic regex substitution.
I was just going to say, jq is like sed in that I only use 1% of it 99% of the time, but unlike sed in that I'm not aware of any clearly better if less ubiquitous alternatives to the 1% (e.g., Perl or ripgrep for simple regex substitutions in pipelines because better regex dialects).
Closest I've come, if you're willing to overlook its verbosity and (lack of) speed, is actually PowerShell, if only because it's a bit nicer than Python or JavaScript for interactive use.
That’s interesting! Can you say a little more? I find jq’s syntax and semantics to be simple and intuitive. It’s mostly dots, pipes, and brackets. It’s a lot like writing shell pipelines imo. And I tend to use it in the same way. Lots of one-time use invocations, so I spend more time writing jq filters than I spend reading them.
I suspect my use cases are less complex than yours. Or maybe jq just fits the way I think for some reason.
I dream of a world in which all CLI tools produce and consume JSON and we use jq to glue them together. Sounds like that would be a nightmare for you.
I'm not GP, I use jq all the time, but each time I use it I feel like I'm still a beginner because I don't get where I want to go on the first several attempts. Great tool, but IMO it is more intuitive to JSON people that want a CLI tool than CLI people that want a JSON tool. In other words, I have my own preconceptions about how piping should work on the whole thing, not iterating, and it always trips me up.
Here's an example of my white whale, converting JSON arrays to TSV.
That whole map and from entries throws it off. It's not a good use for what you're doing. tsv expects a bunch of arrays, whereas you're getting a bunch of objects (with the header also being one) and then converting them to arrays. That is an unnecessary step and makes it a little harder to understand.
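For anyone following along without the original filter in front of them, here is a minimal Python sketch of the shape being described: build one header array, then turn each object into an array in the same column order. It is not the jq filter from this thread. The function name `objects_to_tsv` and the sample data are made up for illustration, and it assumes flat objects whose column order comes from the first object's keys.

```python
import json


def objects_to_tsv(json_text):
    """Convert a JSON array of flat objects to TSV text, header row first."""
    rows = json.loads(json_text)
    if not rows:
        return ""
    # Assumption: the first object's keys define the columns and their order.
    cols = list(rows[0].keys())
    # The header is just another array of strings, like every data row.
    lines = ["\t".join(cols)]
    for row in rows:
        # Keys missing from a row become empty cells; values are stringified naively.
        lines.append("\t".join(str(row.get(c, "")) for c in cols))
    return "\n".join(lines)


# Hypothetical sample input, not taken from the thread.
sample = '[{"name": "a", "n": 1}, {"name": "b", "n": 2}]'
print(objects_to_tsv(sample))
```

Note that this sketch does no escaping of tabs or newlines inside values, so it is only safe for simple data.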
Thanks for sharing, this is much better, though I actually think it is the perfect example to explain something that is brain-slippery about jq
look at $cols | $cols
my brain says hmm that's a typo, clearly they meant ; instead of | because nothing is getting piped, we just have two separate statements. Surely the assignment "exhausts the pipeline" and we're only passing null downstream
the pipelining has some implicit contextual stuff going on that I have to arrive at by trial and error each time since it doesn't fit in my worldview while I'm doing other shell stuff
Honestly both of those make me do the confused-dog-head-tilt thing. I'd go for something sexp based, perhaps with infix composition, map, and flatmap operators as sugar.
I'm often having trouble with figuring out in advance what the end result will be when processing an input array: an array of mapped objects or a series of self-contained JSON objects? Why? Which one is better? What if I would like to filter out some of the elements as part of the operation?
CEL looks interesting and useful, though it is neither common nor familiar imo (not for me at least). Quoting from https://github.com/google/cel-spec
# Common Expression Language
The Common Expression Language (CEL) implements common
semantics for expression evaluation, enabling different
applications to more easily interoperate.
## Key Applications
- Security policy: organizations have complex infrastructure
and need common tooling to reason about the system as a whole
- Protocols: expressions are a useful data type and require
interoperability across programming languages and platforms.
Funny that everyone is linking the tools they wrote for themselves to deal with this problem. I am no exception. I wrote one that just lets you write JavaScript. Imagine my surprise that this extremely naive implementation was faster than jq, even on large files.
It's because .json itself has so much useless cruft it's often annoying to deal with. I am forever indebted to my younger self for forcing me to learn Clojure. Most of the time I choose not to even bother with JSON anymore - EDN is semantically so much cleaner - it's almost twice as compact (yet lossless), it's far more readable (quotes and commas are optional), and easier to work with structurally. These days I'd use borkdude/jet or babashka and then deal with data in Clojure REPL - there I can inspect it from all sorts of angles, it's far easier to group, sort, slice, dice, map and filter through it. One can even easily visualize the data using djblue/portal.
Why most people strangle themselves with confusing jq operators unnecessarily, I will never understand. Clojure is not that hard, maybe learn some basics, it comes in handy a lot. Even when your team doesn't have any Clojure code.
To fix this I recently made myself a tiny tool I called jtree that recursively walks json, spitting out one line per leaf. Each line is the jq selector and leaf value separated by "=".
No more fiddling around trying to figure out the damn selector by trying to track the indentation level across a huge file. Also easy to pipe into fzf, then split on "=", trim, then pass to jq
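The idea is small enough to sketch. What follows is not the actual jtree source, just a rough Python approximation of the behavior described: walk the JSON recursively and emit one `selector=value` line per leaf. The function name `leaf_lines`, the identifier regex, and the choice to treat empty objects and arrays as leaves are all assumptions of this sketch.

```python
import json
import re

# Assumption: keys matching this pattern can be written as .key;
# everything else falls back to the quoted ["key"] form.
IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def leaf_lines(value, path=""):
    """Yield one 'selector=value' string for every leaf under value."""
    if isinstance(value, dict) and value:
        for key, child in value.items():
            if IDENT.match(key):
                step = f"{path}.{key}"
            else:
                # Bracket form needs a leading "." when it starts the path.
                step = f"{path or '.'}[{json.dumps(key)}]"
            yield from leaf_lines(child, step)
    elif isinstance(value, list) and value:
        for i, child in enumerate(value):
            yield from leaf_lines(child, f"{path or '.'}[{i}]")
    else:
        # Scalars (and empty containers) are leaves; print the value as JSON.
        yield f"{path or '.'}={json.dumps(value)}"


# Hypothetical sample document, not taken from the thread.
doc = {"a": [1, {"b c": "x"}], "d": None}
for line in leaf_lines(doc):
    print(line)
```

One caveat if you split on "=" afterwards: keys and string values can themselves contain "=", so the split is only unambiguous for well-behaved data.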
Like I did with regex some years earlier, I worked on a project for a few weeks that required constant interactions with jq, and through that I managed to lock in the general shape of queries so that my google hints became much faster.
Of course, this doesn't matter now, I just ask an LLM to make the query for me if it's so complex that I can't do it by hand within seconds.
LOL ... I can absolutely feel your pain. That's exactly why I created for myself a graphical approach. I shared the first version with friends and it turned into "ColumnLens" (ImGUI on Mac) app. Here is a use case from the healthcare industry: https://columnlens.com/industries/medical
Because the output you get can have hallucinations, which don’t happen with a deterministic tool. Furthermore, by getting the `jq` command you get something which is reusable, fast, offline, local, doesn’t send your data to a third-party, doesn’t waste a bunch of tokens, … Using an LLM to filter the data is worse in every metric.
I get that AI isn’t deterministic by definition, but IMHO that’s become the go-to reason given for not using AI, regardless of the use case.
I’ve never seen AI “hallucinate” on basic data transformation tasks. If you tell it to convert JSON to YAML, that’s what you’re going to get. Most LLMs are probably using something like jq to do the conversion in the background anyway.
AI experts say AI models don’t hallucinate, they confabulate.
Just because you haven't seen it hallucinate on these tasks doesn't mean it can't.
When I'm deciding what tool to use, my question is "does this need AI?", not "could AI solve this?" There are plenty of cases where it's hard to write a deterministic script to do something, but if there is a deterministic option, why would you choose something that might give you the wrong answer? It's also more expensive.
The jq script or other script that an LLM generates is way easier to spot check than the output if you ask it to transform the data directly, and you can reuse it.
I already know that. That's why we have deterministic algorithms, to simplify that complexity.
You have much to learn, witty answers mean nothing here, particularly empty witty answers, which are no better than jokes. Maybe stand-up comedy is your calling in life.
This is great, your final line summarizes my thoughts as well. When it comes to matters of faith your average Redditor and Hacker News commenter will heap scorn and derision on religious people for accepting things blindly without any proof, yet they will blindly accept what other people tell them is true, or now what an LLM says is true.
My co-founder and I met in high school, and we wanted the name to carry a sense of craft. Cardboard was always that material in school projects that was firm enough to hold structure but malleable enough to build almost anything out of. That balance of structure and flexibility felt like a good metaphor for what we're building.
Also we just thought it was a cool name and bought a bunch of domains... https://cardboard.mov is one of my favorites :)
Exactly my thought. MapQuest had a big print button for A to B directions in the late 90s, before Google even existed. I can't find the print button anywhere on their site today.
When I was a kid I sent a letter to Snapple telling them that they should make Snapple flavored popsicles. They sent me a nice letter telling me it was a good idea. I have not thought about it since. But I wonder if my letter directly led to this disaster:
"Disaster on a stick
An attempt to erect the world’s largest popsicle in a city square ended with a scene straight out of a disaster film — but much stickier."