No telemetry in the Rust compiler: metrics without betraying user privacy

kibwen · on Aug 1, 2023

For context, the author (estebank) is the current force behind the Rust compiler's excellent error messages.

To reiterate the lede, so it doesn't get buried: "I’m convinced that rustc should never have the capability to make network connections. [...] When it comes to Rust, users’ machines are (and should be) entirely under the users’ control."

(And I don't believe this post is a response or a rebuttal to anything in particular, other than possibly being in the news because of Go. Telemetry was considered for the Rust compiler pre-1.0 and rejected for privacy reasons, and AFAIK has never seriously been pursued since.)

SkyMarshal · on Aug 1, 2023

Tbf he didn't bury the lede, it's right in the second paragraph:

> I believe that no compiler needs to call home. But I also believe that we have a viable alternative if we reject the `tele` part of telemetry. Local metrics that are never sent anywhere can still give us valuable insight about how the Rust compiler is behaving.

Groxx · on Aug 1, 2023

Editing this to make it clear: the article proposes local collection that could later be submitted.

This is the right path! Lots of manual submission processes can use this to improve data and streamline stuff, and keeping it local solves nearly all privacy concerns.

But nothing manual will ever get good submission rates unless it's zero effort, because that effort stands directly in the path of people's actual goals when they hit an error. It's the worst possible time to ask for something.

---

>People saw ICEs on nightly and didn’t report them. Maybe they thought they were already fixed, maybe they didn’t know how to report them. Whatever the reason, nightly and beta failed to be an effective canary.

Unless you make it trivial, you're asking for a lot from people.

I can't speak to Rust in particular[1], but it's VERY common for "please report this issue! [link]" to be met with a login wall, long list of requirements and metadata requested, asks for detailed lists of what you did and how it's relevant and what you expected and...

yeah, of course people don't usually report these issues.

They rightfully realize it's asking for a major time investment both right now (presumably they were trying to do a thing and would prefer to continue doing the thing rather than learning your reporting system) and in the future (questions for further info on bug reports are common).

Make it as easy as the telemetry you want to add, and it sometimes actually happens. i.e. no login, as near to zero effort as is possible, gather all information for them, anonymize it for them, etc. If you can do it automatically, you have trivially proven that you can also do it manually, so do that.

Give people a pre-filled link with a one-click submit and an optional "contact me at [email]" field.

---

[1]: https://github.com/rust-lang/rust/issues/93701 implies they do this too. They made or already had a github account, have run multiple commands to gather info, filled in a templated text field by hand, and then they're asked to make a minimal reproduction and check if it still exists in nightly? Is it any surprise at all that people do not usually do this?

estebank · on Aug 1, 2023

I do want to make this as easy as possible. As part of that effort I am experimenting[1] with making filing an ICE report be as easy as "open ticket and attach this file". We do require a GitHub login because that is where we keep our issue tracker, and given the user base size I would have to expect that any alternative service would require some kind of login to cut down on spam.

[1]: https://github.com/rust-lang/rust/pull/114128

Groxx · on Aug 2, 2023

Frankly I think GitHub Issues is possibly the biggest barrier, and not just in UX terms, and it astonishes me that it's often the only (visible) option for so many major projects. I've watched multiple coworkers run across issues in other languages and systems (we don't use Rust, sadly), and when they get to the GitHub report they stop cold.

Submitting it there means making it literally public, and relatively highly visible. Can they do that with [internal software X]? Do they need to hide their internal username, which is all over the report? Are they publicly acting in the company's name by sticking it in a world-visible-and-google-indexed location with their work GitHub account? A good number don't even have a separate personal account - GitHub is big, but extremely far from ubiquitous.

These are not hypothetical examples either, I've had newer engs ask me about ^ multiple of those. I've seen more-senior-than-me people asking our legal department similar questions before clicking submit, and mostly getting non-answers two days later. I see almost every case including manual scrubbing of dozens or hundreds of lines of output to try to avoid these questions. With the increasing public pile-ons in GitHub lately, this is all likely to worsen.

All of that is coming from highly motivated people trying to improve things for themselves and possibly dozens or hundreds of others (else they just try to work around the crash). And they frequently stop.

Meanwhile everyone just clicks "report and clear" in Intellij error reports.

amluto · on Aug 1, 2023

I wonder if splitting the telemetry upload into a separate command could make sense. So rustc could panic and say “Run 'rustup report' to report this issue”, and the user could send the report at their option. (Or ‘rustup report 12345' where 12345 identifies this panic.)

This would be the ultimate opt-in - the user has to specifically request each report.
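
A rough sketch of that split, for illustration only. `rustup report` is the proposal above, not an existing command as far as I know, and `ReportStore`, `save`, `hint` and `preview` are made-up names. The assumption is that the compiler side only writes a report locally and prints an ID, while a separate, user-invoked step looks the report up for preview:

```rust
use std::collections::BTreeMap;

/// Hypothetical local store for crash reports. Nothing here touches the network.
#[derive(Default)]
struct ReportStore {
    next_id: u32,
    reports: BTreeMap<u32, String>,
}

impl ReportStore {
    /// Called at panic time: keep the report locally and return its ID.
    fn save(&mut self, report: &str) -> u32 {
        self.next_id += 1;
        self.reports.insert(self.next_id, report.to_string());
        self.next_id
    }

    /// The hint the compiler would print for the user (wording is illustrative).
    fn hint(id: u32) -> String {
        format!("Run 'rustup report {id}' to report this issue")
    }

    /// Called only by the separate, user-invoked command: hand back the
    /// stored report so it can be previewed before anything is sent.
    fn preview(&self, id: u32) -> Option<&str> {
        self.reports.get(&id).map(String::as_str)
    }
}
```

The point of the shape is that `save` has no way to send anything; submission would live entirely in whatever the user runs afterwards.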

Groxx · on Aug 2, 2023

Honestly this sounds pretty good to me. Unlikely to be accidentally triggered, easy to explain and run, lets you do it later rather than having to save things before continuing, and provides a few ways to preview the data before submitting (e.g. temporarily run a webserver to locally view and possibly clean up the report, which can easily include large files, and just click submit there to send it with CORS... or bundle it into a file that can be dropped somewhere if that doesn't actually work).

JohnFen · on Aug 1, 2023

Why require them to open a ticket, though? Just allow an easy way for people to upload the file and be done with it. You should take care of whatever bookkeeping your process requires yourself.

I have no github account, for instance, and there's no chance that I'd create one just to provide this data.

davidhyde · on Aug 1, 2023

< "We do require a GitHub login because that is where we keep our issue tracker"

The problem with this is that you are de-anonymizing the user when you ask for this and the user may decide that proceeding is not worth it. Also, they may find that logging in is a pain. Think of those companies who ask you to log in before you can unsubscribe from their newsletter. It is unpleasant for the user who is already upset.

I think that the GP made some really good points and may well have adequately explained the following point you made. Re-quoting: < "Whatever the reason, nightly and beta failed to be an effective canary." I think it's worth the effort to figure out the reason why nightly is not an effective canary, and if it's as simple as the user requiring a github login and not wanting to go that far, then that should be fairly simple to solve.

I have a suggestion that may or may not be feasible. Logins are one way to cut down on spam but not the only way. You don't have to have users directly posting to your issue tracker, you could use a website. The concerns you have about telemetry are absolutely on point so this service could simply be an anonymous dumping ground for copy-paste ICE compiler output that anyone, including bots could use. No automation from the user's machine, no telemetry, no metrics. The service would be responsible for despamming / deduplicating the submissions and posting to the github issue tracker using an account not associated with the original poster.
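
To make the deduplication part concrete, here is a minimal sketch under stated assumptions. The normalization rule (trim lines, drop blank ones, mask digits so differing line numbers don't split otherwise identical reports) is something I picked for illustration, and `fingerprint`, `Intake` and `submit` are hypothetical names, not part of any existing service:

```rust
use std::collections::HashSet;

/// Reduce a pasted ICE report to a comparison key: trim each line, drop
/// empty lines, and mask digits. The rule itself is an assumption.
fn fingerprint(report: &str) -> String {
    report
        .lines()
        .map(str::trim)
        .filter(|line| !line.is_empty())
        .map(|line| {
            line.chars()
                .map(|c| if c.is_ascii_digit() { '#' } else { c })
                .collect::<String>()
        })
        .collect::<Vec<_>>()
        .join("\n")
}

/// Anonymous intake queue: keeps only the first report per fingerprint.
#[derive(Default)]
struct Intake {
    seen: HashSet<String>,
    queue: Vec<String>,
}

impl Intake {
    /// Returns true if the report was new and queued for triage.
    fn submit(&mut self, report: &str) -> bool {
        let key = fingerprint(report);
        if self.seen.insert(key) {
            self.queue.push(report.to_string());
            true
        } else {
            false
        }
    }
}
```

A real service would also need spam filtering and a smarter notion of "same crash"; this only shows that the bookkeeping can live on the service side rather than with the reporter.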

Anyway, you guys do superb work and I really think that articles like this are important. Thanks for posting it.

LightFog · on Aug 1, 2023

‘Whatever the reason, nightly and beta failed to be an effective canary’ - I'm also curious about this. It is typical to try to move quality ‘left’ - which would imply understanding this issue and fixing it. But the focus seems to be on adding more logging - which will by definition have less impact on the quality of public releases than improving the quality of beta/canary releases will.

Yoric · on Aug 1, 2023

The Element Web Matrix client does that. I'm convinced that we're losing 99% of bug reports by making it so complicated.

That being said, the Rust crowd is very different, so it's definitely worth a try!

yorwba · on Aug 1, 2023

I've never had to log in to send a Firefox or Thunderbird crash report to Mozilla. A look at what they're doing might be fruitful.

yjftsjthsd-h · on Aug 1, 2023

> as easy as "open ticket and attach this file"

Does the "open a ticket" step expand into "create an issue and then fill out an elaborate questionnaire"? For instance, https://github.com/rust-lang/rust/blob/master/.github/ISSUE_... isn't... bad for what it is, but it's definitely asking for enough work that it falls back into "yeah no maybe someday when I'm not trying to get my code to work" territory. (Asking to clarify; I think you might be suggesting to reduce to the form just being "got a ICE? paste your error here and hit submit" which would be good)

estebank · on Aug 1, 2023

The only mandatory fields in that form are the rustc version (that's printed as part of the error) and the backtrace. The file being generated includes both of those things.

> Asking to clarify; I think you might be suggesting to reduce to the form just being "got a ICE? paste your error here and hit submit" which would be good

I would do that once the new ICE storage mechanism is exercised enough for everyone to be confident in its behavior and enabled in the stable channel.

JohnFen · on Aug 1, 2023

> it's VERY common for "please report this issue! [link]" to be met with a login wall

Which unintentionally, but strongly, communicates that the devs don't actually want the telemetry data.

0xbadcafebee · on Aug 1, 2023

What privacy concerns? What the hell are they putting in the telemetry that the user should be afraid of being sent over an encrypted connection? Is my bank account number being included in the compiler metrics?

At this point I consider any advocate of "privacy" in a tech context to be an anti-intellectual cargo-cultist. Just look at the linked article. Absolutely not one single example of a privacy violation or its implications. They just wave their hands in the air and say "it's contentious" and "risk is too great". Let's completely avoid an actual discussion of privacy, and just assume that any information about a user (like their IP address) will immediately result in them being sent to a gulag, despite no evidence that this has ever happened, much less how often, how likely, ways to avoid it, or the benefits that are lost by not allowing for said insanely-unlikely-risk. It's people afraid of their own shadow that would rather live in a prepper shelter than deal with the real world.

Of course we should completely ignore the fact that this is developer software, and they, being technical people, have many means to disable specific outgoing network connections if they so choose. But better to be paranoid than rational, because "trust is key".

Groxx · on Aug 1, 2023

>What privacy concerns?

The core thing to understand here is that what constitutes their concerns is not your decision. It's theirs.

You don't know what they put in their code. Maybe they did put their bank account in a function name. The tool won't tell them they shouldn't have done that, and there's currently no reason why they shouldn't if they're just playing around.

Similarly, you don't know what they're doing or how doing that might expose them in ways they might care about. If my country decides to outlaw something and starts looking for mentions of it and jailing people, your decision yesterday could endanger me. This happens with some regularity, and you can't reliably predict the topics that get targeted.

I generally approve telemetry for tools because I know how useful it is. But I see no reason why that should be your choice instead of mine.

0xbadcafebee · on Aug 1, 2023

> Maybe they did put their bank account in a function name.

Thank you for proving my point - again, no rational debate here. Obviously nobody would ever make a function name their bank account number. And why would metrics include function names? Metrics are numbers, not logs.

> Similarly, you don't know what they're doing or how doing that might expose them in ways they might care about. If my country decides to outlaw something and starts looking for mentions of it and jailing people, your decision yesterday could endanger me. This happens with some regularity, and you can't reliably predict the topics that get targeted.

Yet another contrived example that has nothing to do with the subject at hand and has no rational basis for happening. A random country creates a random law that happens to outlaw a compiler metric? You think a country's gonna make software development illegal soon?

> I generally approve telemetry for tools because I know how useful it is. But I see no reason why that should be your choice instead of mine.

Nobody's taking away your choices! You don't want telemetry? Block the outgoing connections! Disable it in a config file! Use a friggin' VPN like all the other privacy-paranoid do!

Groxx · on Aug 1, 2023

"Telemetry" frequently includes more than low-cardinality metrics. This article (and my root comment) mentions crash information, for example, which could easily include function names and other user information (username is often part of file paths, which are often part of crash information).

And the example is really not contrived. For example, homosexuality has waffled between "get married, love is love" and "your government is explicitly trying to kill you" repeatedly around the world. Congratulations! You worked on Grindr two years ago when they started their internet dragnet, as they can see by information coming from your compiler, on your machine, at your home! You now have to flee the country.

They're not going to be looking for your compiler crash logs, they're looking at everything they can possibly collect for keywords, and oopsie, there's one.

You (you personally and the "royal" you) cannot predict what will matter in the future, and your legitimate-threats do not match everyone else's legitimate-threats. I agree privacy-extremists exist, but so do other kinds of extremists and extreme situations. Ignorance of those doesn't mean they don't exist.

0xbadcafebee · on Aug 1, 2023

Crash reports are not telemetry, they are debugging/error reporting tools. Every application that has ever had a crash reporting function, did so by prompting the user to submit it first. The reason is pretty obvious: crash reports contain more data than a user might like to submit, and so the user gets to choose to submit it or not. Often they do choose to submit it, but again, they get the choice, and don't need to go through an awkward process to manually copy their offline data into an online process. Nobody has stopped using operating systems and browsers just because they have a crash reporter that can optionally submit the crash report to an internet site.

Telemetry is not sensitive data. Telemetry is "signals". For desktop software, telemetry is often what operating system, cpu architecture, and software version you're running. Your OS reports that, your browser reports that, every self-updating application you use reports that. This is not controversial nor sensitive data and has no privacy implication.

estebank · on Aug 1, 2023

For crash reports the only elements included are coming from the compiler (the symbols of the stack trace do not leak user data), but the panic description can leak symbols from the user's code. For automated mechanisms you could make it so that nothing user generated is sent, but a lot of consideration is needed to not make a mistake.
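
As one small piece of that consideration, here is a sketch of scrubbing user names out of file paths (mentioned upthread as a common leak). It is an illustration under assumptions, not what rustc does: `redact_home_dirs` is a hypothetical helper, it only handles Unix-style `/home/<name>/` paths, and it does nothing about symbols from user code in the panic message:

```rust
/// Replace the user name in `/home/<name>/...` paths with a placeholder.
/// Only the Unix-style `/home/` prefix is handled; other layouts pass through.
fn redact_home_dirs(text: &str) -> String {
    const PREFIX: &str = "/home/";
    let mut out = String::with_capacity(text.len());
    let mut rest = text;
    while let Some(pos) = rest.find(PREFIX) {
        // Copy everything up to and including the prefix unchanged.
        let after_prefix = pos + PREFIX.len();
        out.push_str(&rest[..after_prefix]);
        rest = &rest[after_prefix..];
        // Skip the user-name segment, which ends at '/' or whitespace.
        let end = rest
            .find(|c: char| c == '/' || c.is_whitespace())
            .unwrap_or(rest.len());
        out.push_str("<user>");
        rest = &rest[end..];
    }
    out.push_str(rest);
    out
}
```

Even this toy version shows why mistakes are easy: any path layout or identifier the scrubber doesn't know about goes through untouched.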

LightFog · on Aug 1, 2023

Some apps collect full memory dumps - on Windows at least there is nothing stopping them. That will indeed contain your bank details.

JohnFen · on Aug 1, 2023

We have had decades of companies demanding, sneaking, and exfiltrating data from us regardless of our wishes. Companies that put a great deal of effort even into bypassing various methods users have made to put a stop to this.

Collection of any data about me, my machines, or my use of my machines without my informed consent is spying. How sensitive the data actually is is beside the point.

In any case, so much good will has been burned now that it's not unreasonable to be very sensitive about any data collection whatsoever. People being reluctant to share isn't a case of people being unreasonable, it's an understandable response to so many years of abuse and violation of trust.

nindalf · on Aug 1, 2023

> What privacy concerns?

Most HN people are anti-telemetry and tracking. I've never actually seen these privacy concerns being fleshed out with specific details of how logs could be abused but it really doesn't matter if a real mechanism for abuse exists. What matters is people fear that it may exist and stop using the tool as a result.

I suggest taking a look at the thread suggesting telemetry in the Go toolchain (https://news.ycombinator.com/item?id=34707583). You can judge for yourself if there are legitimate concerns being surfaced, but there's no doubt that there's significant opposition to such a proposal.

Rust is getting ahead of this discussion by making it clear that their log collection won't phone home and it won't be automated.

krono · on Aug 1, 2023

Sometimes the subject of a person's concerns exists beyond their own ego. This is perfectly normal and surprisingly common behaviour.

weinzierl · on Aug 1, 2023

"I’m convinced that rustc should never have the capability to make network connections."

[..]

"[..]user information should never leave their machine in an automated manner"

Sounds reasonable. The only wish I have is that we should always have the option to download a version without the metrics part as well.

So tele never and metry with an easy, clear, reliable and sustainable opt-out.

richardwhiuk · on Aug 1, 2023

Out of curiosity, what's your justification for wanting opt out for local logging/metrics, to the extent you need a separate build?

weinzierl · on Aug 1, 2023

Primarily because I don't want this to be a slippery slope where the metry of today becomes the telemetry of tomorrow. As long as these features are easy to rip out and not deeply intermingled with the rest I'm good and willing to contribute metrics even.

Also, no code is free: it needs to be written and maintained, the end-user needs to download, store and run it, and it brings an additional risk of breaking things. I just wish there was the possibility to opt out of that, even if I probably wouldn't.

Finally, if this metrics code is intermingled with the rest, I can see this as an additional burden for a potential certification of the compiler.

richardwhiuk · on Aug 1, 2023

Logs and metrics have a horrible tendency to be a cross cutting concern, sadly - you normally need to add instrumentation throughout the codebase to get good metrics.

tareqak · on Aug 1, 2023

I am thinking out loud here: what would be the comparable word here? topometry?

https://en.m.wiktionary.org/wiki/topo-

jjav · on Aug 1, 2023

Having a command line option to enable metrics for that run (with the default being off, of course) is a good solution.

weinzierl · on Aug 2, 2023

I'd prefer a version where all the non-essentials are missing over a switch.

Thaxll · on Aug 1, 2023

Go tried the same, and chose to let the user opt-in.

https://github.com/golang/go/discussions/58409

The privacy design for Go was actually sound.

https://research.swtch.com/telemetry-design

wrs · on Aug 1, 2023

The paradigm of local logging plus the ability to analyze locally for known questions/concerns and give the user a one-click way to easily, voluntarily report to the maintainers makes lots of sense. You do lose the ability to learn about unknown concerns, but seems like a good 80% solution. This is the same as “sorry your program crashed, would you like to report it?” but for a wider variety of issues/concerns than crashing.

zer8k · on Aug 1, 2023

I don't like even the whisper of anything that resembles telemetry.

Can someone explain the counter argument? A compiler is as low level as it goes. There is an assumed level of competence with someone using a compiler. Why is it better to collect data than just have the user cat some logs somewhere and post them?

After the recent debacle(s) with the rust foundation I am almost certain a "local only" telemetry will morph into VS Code's full system colonoscopy. Please do not do this. Find another way. Compilers were bug fixed without telemetry just fine for 30 years. The Rust Foundation are by and large bad actors in an otherwise okay ecosystem. Don't let them oracle-ize it.

estebank · on Aug 1, 2023

> Why is it better to collect data than just have the user cat some logs somewhere and post them?

Those logs are collected data. I'm arguing for those logs to exist.

> After the recent debacle(s) with the rust foundation

This has nothing to do with the foundation. I have nothing to do with the foundation.

> I am almost certain a "local only" telemetry

"Local-only telemetry" is an oxymoron.

zer8k · on Aug 1, 2023

The foundation controls Rust in the same way Oracle controls Java. This link is not tenuous. Are you asserting that the compiler team would never be able to be compelled to act against the interests of its users? Furthermore, that there's no way this could happen?

> Those logs are collected data. I'm arguing for those logs to exist.

Concrete example: I run into an error. I post a minimal example to a bug system. Is it not possible for the compiler team to run that example and pull their own special data using an instrumented compiler? How will these logs improve it?

estebank · on Aug 1, 2023

> The foundation controls Rust in the same way Oracle controls Java.

That is not quite correct. The Rust Foundation is an entity that manages the trademark, relationships with other legal entities, and handles money to pay for running costs. The technical part of the house is the somewhat amorphous "Rust Project", which you can check here https://www.rust-lang.org/governance.

> Concrete example: I run into an error. I post a minimal example to a bug system. Is it not possible for the compiler team to run that example and pull their own special data using an instrumented compiler? How will these logs improve it?

Transient errors exist, particularly when incremental compilation comes into play. In other cases the bug/problem only manifests in a closed source project and the reporter cannot produce a minimal reproducer. There are platforms that the people triaging and fixing bugs do not have access to and might have diverging behavior.

zer8k · on Aug 1, 2023

> Transient errors exist, particularly when incremental compilation comes into play. In other cases the bug/problem only manifests in a closed source project and the reporter cannot produce a minimal reproducer. There are platforms that the people triaging and fixing bugs do not have access to and might have diverging behavior.

Fair, I will need to do more reading on this strategy of gathering data. I did not consider closed source.

arp242 · on Aug 1, 2023

A rather long list of use cases (for Go) is at: https://research.swtch.com/telemetry-uses

It doesn't take that much imagination to come up with useful use cases. I'm sure there are many for Rust, too.

Linux is even more low-level, and telemetry would undoubtedly be useful there too. "We don't break userspace" has always been interpreted flexibly – for example it's fine to break userspace if no one will notice – or rather: if it's expected that no one will notice. But it's almost impossible to say if anyone will notice, and breaking changes are frequently rolled back because they did break stuff, causing wasted time and frustration on all sides.

Whether telemetry should be done is a different discussion, involving upsides, downsides, and trade-offs. But that it's useful for almost all software with significant widespread usage is hard to deny.

hedgehog · on Aug 1, 2023

I am highly privacy-conscious, but also rely on the robustness of tools including the Rust + LLVM compiler stack. That robustness is in part built on the high degree of scrutiny that comes with many other people using the same tools in many different ways, and reporting the problems they find. Tools to make reporting easier will help it happen more often.

mbo · on Aug 1, 2023

I work on a company-internal compiler. We collect analytics on what kinds of syntax and type errors engineers have been encountering, compile times, and crash reports.

They've been invaluable - we've squashed some gnarly performance issues with some degenerate constructs, caught a few performance regressions post-release, and spent time polishing the type errors and semantics of the ones most encountered.

thih9 · on Aug 1, 2023

I’m not clear on the details. E.g.:

> Without user action the data would never go anywhere.

What is the user action?

I wish this was worded in a more specific way. Like: without user’s opt in the data would never go anywhere. Or: without user’s explicit consent (type “yes” to share data) the data would not go anywhere. Is either correct here?

estebank · on Aug 1, 2023

I'm not against having some kind of service that users can install and enable for telemetry using these metrics, but metrics would be useful even if we never do anything like that and they only ever exist on the users' machine: "Thank you for your ticket! Can you check your metrics logs if x, y and z are present?"
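
To illustrate the "can you check your metrics logs" workflow, here is a minimal sketch of a purely local check. The `name=value` line format and the function name `presence_summary` are assumptions made up for the example; no such log format has been specified:

```rust
/// For each metric name the maintainers asked about, report whether it
/// appears in the local log. Assumes a made-up `name=value` line format.
fn presence_summary(log: &str, wanted: &[&str]) -> Vec<(String, bool)> {
    wanted
        .iter()
        .map(|name| {
            let present = log
                .lines()
                .filter_map(|line| line.split_once('='))
                .any(|(key, _)| key.trim() == *name);
            ((*name).to_string(), present)
        })
        .collect()
}
```

The user would run something like this on their own machine and paste only the yes/no summary into the ticket, so the raw log never has to leave.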

thaunatos · on Aug 1, 2023

Later in the post:

> I could picture a future where the Compiler Team realizes we have an important question about the behavior of the software, that could be answered by analyzing these metrics, and have a VSCode or cargo plugin that analyzes these metrics locally, and if relevant information is found in these logs an appropriate summary is presented to the user for them to file a ticket.

remexre · on Aug 1, 2023

It sounds like "without the user using a separate not-installed-by-default program, they wouldn't get uploaded anywhere" to me?

wolrah · on Aug 1, 2023

> What is the user action?

The concept described is that the compiler would generate these metrics locally and store them for a limited period of time, but there would be no built in method for those metrics to leave the system they were generated on.

The user would then have that data available for their own use and could choose to manually share it or independently install a service to automatically share it per their own needs and desires.

So basically collecting the metrics but with no phone-home method built in. The user action to provide consent for access to the data is actually sharing the data.
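
A minimal sketch of what "stored for a limited period, with no way out" could look like, assuming an in-memory store and a retention window in seconds. `Sample`, `LocalMetrics` and the field names are hypothetical and not taken from the article or from rustc:

```rust
/// One locally stored metric sample. Field names are illustrative.
#[derive(Debug, Clone, PartialEq)]
struct Sample {
    recorded_at: u64, // seconds since the Unix epoch
    name: String,
    value: u64,
}

/// Local-only store: it can record and expire samples, and deliberately
/// has no export or upload method.
struct LocalMetrics {
    retention_secs: u64,
    samples: Vec<Sample>,
}

impl LocalMetrics {
    fn new(retention_secs: u64) -> Self {
        Self { retention_secs, samples: Vec::new() }
    }

    /// Record a sample, then drop anything older than the retention window.
    fn record(&mut self, now: u64, name: &str, value: u64) {
        self.samples.push(Sample { recorded_at: now, name: name.to_string(), value });
        let cutoff = now.saturating_sub(self.retention_secs);
        self.samples.retain(|s| s.recorded_at >= cutoff);
    }
}
```

Sharing would then be a separate act by the user, for example copying the stored data into a ticket or installing a tool of their own choosing.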

david2ndaccount · on Aug 1, 2023

I doubt this will be well received, mostly due to the author’s problematic framing of the problem. This is not telemetry, it is just additional local logging. I don’t understand why it’s being framed as telemetry. Just say rustc is (optionally) logging more by default and the compiler people are writing tools that can analyze those logs if you choose to upload them.

Separately it sounds like rustc needs to do a better job collecting crash data that can be easily attached to a bug ticket.

estebank · on Aug 1, 2023

> This is not telemetry, it is just additional local logging.

I agree.

> I don’t understand why it’s being framed as telemetry. Just say rustc is (optionally) logging more by default and the compiler people are writing tools that can analyze those logs if you choose to upload them.

Any kind of metrics mechanism can look like the start of telemetry. I wanted to preempt that misconception by explicitly stating the difference, before any conversation or implementation occurs.

> Separately it sounds like rustc needs to do a better job collecting crash data that can be easily attached to a bug ticket.

I agree.

agilob · on Aug 1, 2023

I'm kind of a privacy freak too, but I actually would not and do not mind telemetry in OSS projects that are transparent about it. I send metrics about archlinux, plasma, firefox or syncthing: check this out https://data.syncthing.net/

I wouldn't mind transparent and honest telemetry in other tools like rustc or even vim. I don't see myself losing anything by sending my screen resolution and version of Xorg or Wayland. How is my privacy betrayed by this? I don't see what I am losing here or how I am betrayed, but I do understand that these metrics will help make the tools better and give intelligent developers data about what to work on and which features are used more frequently.

I think the word telemetry is associated with bad vibes thanks to Microsoft and Google abusing telemetry, having it opt-out by default or giving users no control over it. For opensource tools we should use another word like "Automated User Feedback", or something else without the baggage.

amiga386 · on Aug 1, 2023

> we should use another word like "Automated User Feedback", or something else without the baggage.

Edward Bernays, propagandist, famously coined the term "Public Relations" to give propaganda a new name without the baggage the Germans gave it. https://youtu.be/DnPmg0R1M04?t=486

Telemetry, and the "baggage" associated with it, is the right word. We could be _more_ pejorative and call it "spying" or "tattling", but "telemetry" will do.

yjftsjthsd-h · on Aug 1, 2023

> having it opt-in by default or giving users no control over it.

I think you want "opt-out"?

> For opensource tools we should use another word like "Automated User Feedback", or something else without the baggage.

No, because some projects will immediately re-poison the well by using "Automated User Feedback" as a transparent euphemism for the same bad practices (and in fact, if I read that I would immediately assume it was nothing but a euphemism meant to mislead me). Might as well just say "opt-in telemetry" or maybe "opt-in telemetry with extra steps to make it less identifiable" if that's what we mean.

mike_hock · on Aug 1, 2023

No, we should not invent a euphemism to dodge the bad associations with the real thing.

There's no such thing as "being transparent" about telemetry in OSS because I don't feel like reading a README for every single command that I apt install to see if it has anti-features.

JohnFen · on Aug 1, 2023

> I wouldn't mind transparent and honest telemetry in other tools like rustc or even vim

Me neither, as long as it is voluntary and requires an action on my part for it to happen.

miki123211 · on Aug 1, 2023

Unlike Go, many Rust installs happen not through a package manager, but through Rustup. That's a perfect avenue to add useful opt-in telemetry, while making the privacy fundamentalists happy. If somebody is installing Rust interactively through Rustup, ask them whether they want telemetry or not; if they're installing from anywhere else, assume they don't. It feels like the perfect compromise for Rust: there's no way to have telemetry on without specifically agreeing to it, but a large enough percentage of users would see that prompt to still make the limited opt-in data useful.
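
The decision rule is small enough to sketch. This is only an illustration of the idea: `InstallSource` and `telemetry_enabled` are hypothetical names, and Rustup has no such prompt today as far as I know:

```rust
/// Where the toolchain came from. Variants are illustrative.
#[derive(Clone, Copy)]
enum InstallSource {
    InteractiveRustup,
    NonInteractive, // distro package, CI script, piped installer, ...
}

/// Telemetry is on only if the user was actually asked and typed an
/// explicit yes. Anything else, including an empty answer, means off.
fn telemetry_enabled(source: InstallSource, answer: Option<&str>) -> bool {
    match (source, answer) {
        (InstallSource::InteractiveRustup, Some(reply)) => {
            matches!(reply.trim().to_ascii_lowercase().as_str(), "y" | "yes")
        }
        _ => false,
    }
}
```

Every path other than an explicit "yes" at an interactive prompt falls through to `false`, which is the property that matters here.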

nixpulvis · on Aug 1, 2023

I stopped after the first paragraph (don’t have the headspace to really unload my thoughts here) but,

This could be a great idea. Local metrics in the know, then shared naturally between people. Sounds almost like how things got done before the internet. Imagine that!

I look forward to reading the rest of this and hopefully agreeing with at least some of it ;)

It can be a bad look when you start collecting too much information, makes it seem like something is wrong. But I also 100% see the issue. Having watched a few scenes die, it can be really sad.

Programming languages are especially tough, because they are so near to permanent it’s tempting to actually need to hold on to them.

telios · on Aug 1, 2023

Something tangentially related, but something I'm unaware of - if I do run into an ICE on nightly, what is the current process for reporting that? I've run into one or two before (and I've generally assumed it was me doing something dumb with the environment, and I couldn't replicate it), but I wouldn't mind trying to report them in the future.

estebank · on Aug 2, 2023

You would report it on our issue tracker. Here is an example of what that would look like:

https://github.com/rust-lang/rust/issues/114324

Even if you can't reproduce it, the backtrace can be enough to figure out the problem, or at least lets us know how often the issue has happened.

LightFog · on Aug 1, 2023

Telemetry has a place for product managers but it is a crutch when used for quality - it is heavily lagging and noisy, and introducing it has a pull to the ‘right’ in quality gating. This isn’t a webservice with uptime and recovery requirements - it seems better to optimise processes around catching issues before public release.

workingjubilee · on Aug 1, 2023

The answer to that is simple: We do.

We try to catch things before public release.

We run UNSPEAKABLE amounts of regression tests.

We ask people to test unstable things.

Almost no one actually does test the unstable things. There's too much friction.

Almost no one actually gives us feedback on them. There's too much friction.

Things land on stable, people FINALLY actually mass-use them, the stable thing turns out to be busted, and THEN people overcome the friction to complain because they (rightly) fear it is going to be unfixable. Sometimes it is. Most of these have been very minor so far, and only affected things easily rolled back, but there will come a day when it is not so minor and not so easy to roll back. And nothing we can do in simulation or stress-testing on our end can replace contact with actual programs.

We're not making a typical "product" here. You can't think of it in typical product terms. We're making a programming language here. A contract, effectively, though not a legal one, as it is itself a system of expressing contracts. We cannot simply roll back things that are part of the language's contract in our stable releases if something is flawed, unless the flaw is SO dire as to result in one of the handful of situations that allows us to break it anyways.

LightFog · on Aug 1, 2023

Thanks for the extra context - and just in case my comment came across as very negative or blanket anti-telemetry, I think you are building an awesome tool and the sensitive approach to actually handling metrics seems totally reasonable. Hopefully they end up being useful here - I’ve just seen them not move the needle much on release quality relative to the various nuisances in handling them, although for sure in a different domain.

zamalek · on Aug 1, 2023

There's also a great resource on how people are using rust already: crates.io.

estebank · on Aug 1, 2023

That resource tells us what the code people use looks like, but it doesn't tell us how rustc behaves with it. Crater lets us tell if we've introduced a regression on features people use, but it doesn't tell us for example that compiling tokio got slower or faster, or that there's a high prevalence of code with large types, so we should focus on optimizing those. It also doesn't tell us how rustc behaves on code that is being modified, which will be incomplete/broken/weird. That kind of code is never committed to a repo.

zamalek · on Aug 1, 2023

> It also doesn't tell us how rustc behaves on code that is being modified, which will be incomplete/broken/weird.

Hmmm, good point. It also wouldn't tell you what someone did before giving up because they would never commit code.

Still, it's a useful data point. You could, say, time the build for the top 100 crates once a week, or gather metrics on language features that they use.

optimalsolver · on Aug 1, 2023

[flagged]

tsion · on Aug 1, 2023

You should try reading articles before commenting.

Scratch that, you should try reading the headline before commenting.

You should at least read the first two words of the headline.

How far must the bar be lowered?