If a file on your computer is being used by a program to send information to someone, the answer isn't to destroy/randomize the file and break other applications, the answer is to not use the program that is sending your information somewhere.
This speaks to a more general need for user-friendly audit logs of which resources are accessed by which programs. I should be able to tell on any platform if Spotify called fopen on something in my documents folder.
I checked the Chromium source and for Linux they explicitly mention not being allowed to send it externally[0]; they hash it via SHA1, encode it as base64, and use that value.
Interestingly for Windows they pull the machine id from the registry[1] and (at first glance) it doesn't seem like they're doing any hashing. The raw value gets used.
Haven't checked if the value gets sent externally but based upon the comment on the Linux code I'd bet it's a yes.
> I checked the Chromium source and for Linux they explicitly mention not being allowed to send it externally[0]; they hash it via SHA1, encode it as base64, and use that value.
/etc/machine-id seems to be a random value. What purpose does hashing it before sending it serve? The hash should still uniquely identify a machine. Am I missing something? Kind of makes me think that it's just done so people say "it's ok, because they're hashing it first, so it's secure!", while in reality hashing doesn't do anything to alleviate any concern.
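To make the concern concrete, here is a rough sketch of the SHA1-then-base64 step described in the quoted comment. It is not Chromium's code; the function name and both example IDs are made up for illustration. It only shows that an unkeyed hash of a stable value is itself a stable, unique value.

```python
import base64
import hashlib


def hashed_machine_id(raw_id: str) -> str:
    """SHA-1 the ID text, then base64-encode the 20-byte digest."""
    digest = hashlib.sha1(raw_id.strip().encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Hypothetical example IDs, not taken from any real machine.
machine_a = "0123456789abcdef0123456789abcdef"
machine_b = "fedcba9876543210fedcba9876543210"

token_a = hashed_machine_id(machine_a)
token_b = hashed_machine_id(machine_b)

# The same input always yields the same token, so the token tracks the
# machine exactly as well as the raw ID would.
print(token_a == hashed_machine_id(machine_a))
# Different machines still get different tokens.
print(token_a != token_b)
```

The raw ID is hidden, but anyone who receives the token can still recognize the same machine every time it shows up.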
The source code links to https://www.freedesktop.org/software/systemd/man/machine-id.... which mentions hashing using an "application-specific key". That would at least make the value not correlatable between different apps (so $WEBSITE can't correlate machine IDs with $WELL_BEHAVED_APP's machine IDs).
But either I'm missing something, or Chromium is - it looks like it's straight up hashing the file and not actually using any application-specific keys!
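For comparison, a keyed hash along the lines the man page suggests might look roughly like this. This is only a sketch of the general idea using HMAC-SHA256; which value acts as the key, the truncation to 16 bytes, and the app key strings are my assumptions, and it does not claim to reproduce what systemd's own helper outputs.

```python
import hashlib
import hmac


def app_specific_id(machine_id_hex: str, app_key: bytes) -> str:
    """Derive a per-application ID from the machine ID with a keyed hash."""
    machine_bytes = bytes.fromhex(machine_id_hex.strip())
    # Assumption: machine ID as the HMAC key, application key as the message.
    mac = hmac.new(machine_bytes, app_key, hashlib.sha256).digest()
    # Truncate to 128 bits so the result has the same size as a machine ID.
    return mac[:16].hex()


# Hypothetical machine ID and application keys.
machine = "0123456789abcdef0123456789abcdef"
browser_id = app_specific_id(machine, b"example-browser")
other_id = app_specific_id(machine, b"example-other-app")

print(browser_id != other_id)  # two apps on one machine see unrelated IDs
```

Each app still gets a stable ID for its own use, but two apps cannot match their IDs against each other without knowing the underlying machine ID.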
And having read several rounds of "Device enrollment" phrases now, while the design document is non-public, I would hazard a guess that this is an essential component of enterprise device management.
Google hashes the ID before making use of it, so the actual ID remains disguised, but this absolutely would be necessary if they were trying to implement ChromeOS enterprise device management. (You need a unique identifier per enterprise machine, etc.)
Presumably they only care about this id file with respect to enterprise ChromeOS installations, since they make no effort at all to locate the file in any other location than the one.
It looks more like they simply don't care about reading the file in non-enterprise circumstances, since either the machine is enterprise-managed or it isn't, and as they only transmit hashes of the ID rather than the ID itself, they're in compliance with the FreeDesktop guidelines that require this file to be present:
Removing it just makes it more difficult to write legit programs that make use of such features, while anything nefarious will be able to find other things to use as fingerprints, including hardware serials, MACs and their own fingerprint files spread across the filesystem in non-standard locations.
Unless the OS is meant to be built for privacy and has a goal to run every app in a sandbox where nothing is fingerprintable, removing easily available fingerprints would be a disservice to all.
> The /etc/machine-id file contains the unique machine ID of the local system that is set during installation. The machine ID is a single newline-terminated, hexadecimal, 32-character, lowercase ID. When decoded from hexadecimal, this corresponds to a 16-byte/128-bit value.
> The machine ID is usually generated from a random source during system installation and stays constant for all subsequent boots. Optionally, for stateless systems, it is generated during runtime at early boot if it is found to be empty.
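A quick sketch of checking file contents against the format quoted above (32 lowercase hex characters plus a newline). The function name is made up; the all-zeros rejection follows the man page's statement that a machine-id may not be set to all zeros.

```python
import re

# 32 lowercase hexadecimal characters followed by a single newline.
_MACHINE_ID_RE = re.compile(r"[0-9a-f]{32}\n")


def is_valid_machine_id(contents: str) -> bool:
    """Return True if contents matches the documented machine-id format."""
    if not _MACHINE_ID_RE.fullmatch(contents):
        return False
    # The man page also says the ID may not be all zeros.
    return contents.strip() != "0" * 32


print(is_valid_machine_id("0123456789abcdef0123456789abcdef\n"))  # True
print(is_valid_machine_id("0123456789ABCDEF0123456789ABCDEF\n"))  # False
```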
So if that works for "stateless systems", why can't all machines be "stateless systems"?
You might really want to track machines in your fleet, in a way that persists across reboots. Let's say you can access a machine remotely but you got all your ethernet cables tangled up, so you don't know which physical machine you SSHed into.
Or if you are concerned about being tracked by a third party, you don't want this identifier to exist at all, even if it doesn't persist through reboots.
So at boot, one would run "erase-machine-id", then create a random 32-character hexadecimal number, and then either set "the systemd.machine_id= kernel command line parameter" to it or pass it via "the option --machine-id= to systemd".
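Generating the random value itself is the easy part. A minimal sketch, assuming 32 hexadecimal characters (the length given in the man page) and a made-up function name:

```python
import secrets


def new_machine_id() -> str:
    """Return 32 random lowercase hex characters (128 bits), never all zeros."""
    while True:
        candidate = secrets.token_hex(16)  # 16 random bytes -> 32 hex chars
        if candidate != "0" * 32:
            return candidate


# Example of the kernel command line fragment this would feed into.
print(f"systemd.machine_id={new_machine_id()}")
```

How to get that value onto the kernel command line on every boot depends on the bootloader, which this sketch does not cover.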
Edit: OK, so I created a Debian VM, noted /etc/machine-id, deleted it, and rebooted. And found that it was still missing.
Running systemd-machine-id-setup generated a new machine-id from the D-Bus machine ID. And it was the same as the initial one.
But I also see in man machine-id:
> The machine-id may also be set, for example when network booting, by setting the systemd.machine_id= kernel command line parameter or passing the option --machine-id= to systemd. A machine-id may not be set to all zeros.
What is it with the fetish for inflicting Truenaming in cyberspace?
It's incredibly annoying. It needlessly bloats digital footprints, and it creates an opportunity for exploitation by nefarious actors.
Leave the Truenaming to the users that need it. It doesn't do any good being baked in by default. If they really need it, they'll figure out a way to implement it. If they definitely cannot afford it, and aren't aware it is there by default, you are doing more harm putting it in than you would be by leaving well enough alone.
Revisiting this, I agree that there's quite some "meh" about this. I mean, there's no way to really know how machines have and share identifiers. So one must assume that they have, and do. And deal with it.
VMs seem generally good enough. But then there's WebGL, which generates identifiers based on the host graphics system and guest virtual video driver. So all Debian VMs on a given host have the same identifier.
If it really matters, though, you gotta use different hardware.
Sure, but then that means that honest packages should document all files that the software can be configured to create automatically. Or at least, all but user-specified ones.
Bad title (and I know that fault lies with the source site). At first glance it reads as if the Devuan team is considering embracing/adopting machine IDs, which is counter to their philosophy, when in fact they are against unique identifiers.
Doesn't a MAC address lookup or disk UUID lookup provide similar fingerprinting capabilities? Even the contents of one's .bashrc file could be used for fingerprinting.
I mean, if you start blocking one thing, where do you stop?
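To illustrate the point, here is a rough sketch of how any handful of stable values can be folded into a fingerprint without touching /etc/machine-id at all. The attribute names and values are invented for the example; a real tracker would collect them from the system.

```python
import hashlib


def fingerprint(attributes: dict[str, str]) -> str:
    """Combine arbitrary stable attributes into one hex fingerprint."""
    h = hashlib.sha256()
    # Sort by name so the result does not depend on collection order.
    for name in sorted(attributes):
        h.update(name.encode() + b"\0" + attributes[name].encode() + b"\0")
    return h.hexdigest()


# Hypothetical values standing in for a MAC address, a disk UUID and
# the contents of a shell config file.
machine = {
    "mac": "02:00:00:aa:bb:cc",
    "disk_uuid": "11111111-2222-3333-4444-555555555555",
    "bashrc": "alias ll='ls -l'\n",
}

print(fingerprint(machine))
```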
Ideally you work in reverse and only need to grant access to things to which you want to allow access. Start from zero and let an application provide a manifest of what it wants to access. Ex:
* Open outbound TCP sockets
* Read from $HOME/.config/chromium
* Read from /etc/machine-id
It's not a new concept at all and there are multiple approaches for implementing things like this. Getting mass market adoption is sadly next to impossible.
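A minimal sketch of the deny-by-default manifest idea, just to show the shape of it. The `Manifest` class, its fields, and the expanded home directory `/home/user` are all hypothetical; real sandboxes enforce this in the kernel or a broker process, not in application code.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Manifest:
    """Everything not listed here is denied."""
    outbound_tcp: bool = False
    readable_paths: tuple[str, ...] = ()

    def may_read(self, path: str) -> bool:
        target = PurePosixPath(path)
        # Refuse ".." outright rather than trying to normalize it.
        if ".." in target.parts:
            return False
        for allowed in self.readable_paths:
            root = PurePosixPath(allowed)
            # Allow the listed path itself or anything beneath it.
            if target == root or root in target.parents:
                return True
        return False


# The three grants from the list above, with $HOME assumed to be /home/user.
chromium = Manifest(
    outbound_tcp=True,
    readable_paths=("/home/user/.config/chromium", "/etc/machine-id"),
)

print(chromium.may_read("/etc/machine-id"))   # True
print(chromium.may_read("/home/user/.bashrc"))  # False
```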
The problem is still that the granularity is not right (except for users who simply want to trust the application). For example, when uploading a photo, I don't want to give Facebook access to my entire filesystem, just the photo that I click. And I don't want to give Facebook access to my camera indefinitely, just now.
It will require a lot of design to get security right without deteriorating the UX too much.
But I agree, it's better than simply blocking everything.
> For example, when uploading a photo, I don't want to give Facebook access to my entire filesystem, just the photo that I click. And I don't want to give Facebook access to my camera indefinitely, just now.
FYI that's exactly how iOS does it. When you choose a picture to upload from the camera roll, the target app only gains access to that one photo.
> I don't want to give Facebook access to my entire filesystem,
Android's intent system, together with the changes to storage security in Android Q, will help with this somewhat. Android Q starts expressly prohibiting access to the full filesystem, and images and other intent extras must be passed through the intent call itself rather than as a reference to a file on the filesystem.
That sounds great. But what I really want is if the app does want access to the entire filesystem, then the OS will present the app with a sandboxed filesystem, instead of just blocking the app (causing the app to refuse to work, which is what will happen in practice).
Privacy concerns aside, using the boot device serial number might be a better solution as it can't be deleted or modified and survives reboots and reinstalls.
This line will find it easily.
Not mine, I simply put together the work of others adding only very small modifications. Needs smartctl (smartmontools package on Debian) which can be run only by root.
> The important properties of the machine UUID are that 1) it remains unchanged until the next reboot and 2) it is different for any two running instances of the OS kernel. That is, if two processes see the same UUID, they should also see the same shared memory, UNIX domain sockets, local X displays, localhost.localdomain resolution, process IDs, and so forth.
Because it's possible to forward things like D-Bus, the X11 $DISPLAY, etc. over the network, two processes might be aware of each other over such a connection but not be running on the same machine and therefore be unable to share resources. The machine ID lets them check for that, so you can properly handle things like "I'm going to send a message to the screensaver in my display to not activate, I don't care if it's the same machine" vs. "I'm going to send a message to the terminal in my display to open a new tab, but only if it's actually on the same machine, otherwise I should start a new terminal". (These days I think that definition should be updated to "container" instead of "kernel": if you're running separate logical machines inside the same kernel with separate PIDs etc., they should have separate machine IDs.)
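The terminal example boils down to a comparison like the one below. This is only a sketch with made-up function names and IDs; how a program learns the peer's machine ID (over D-Bus, X11, or something else) is left out.

```python
def on_same_machine(local_id: str, peer_id: str) -> bool:
    """Two processes can share local resources only if their IDs match."""
    return local_id.strip() == peer_id.strip()


def open_terminal_tab(local_id: str, peer_id: str) -> str:
    """Decide between reusing the peer's terminal and starting a new one."""
    if on_same_machine(local_id, peer_id):
        return "ask the existing terminal for a new tab"
    # The display is forwarded from another host, so a tab opened there
    # would not see our files or processes.
    return "start a new terminal"


# Hypothetical IDs for the local host and a forwarded display's host.
local = "0123456789abcdef0123456789abcdef"
remote = "fedcba9876543210fedcba9876543210"

print(open_terminal_tab(local, local))
print(open_terminal_tab(local, remote))
```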
systemd and (IIRC) cloud-init use it to run once-per-machine tasks on machines that could come from images: if you want to prep a number of machines in advance, do the install, then change the machine ID. At boot time, startup scripts will say "Oh, this machine ID has not been initialized yet" and do things, and then not do them on the next boot.
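The once-per-machine pattern can be sketched roughly as follows. This is not how systemd or cloud-init actually implement it; the stamp file and function name are assumptions made for the example.

```python
import tempfile
from pathlib import Path


def run_first_boot_tasks_if_needed(machine_id: str, stamp: Path) -> bool:
    """Run once-per-machine work if this machine ID has not been seen yet.

    Returns True if the tasks ran, False if they were skipped.
    """
    previous = stamp.read_text().strip() if stamp.exists() else None
    if previous == machine_id:
        return False
    # ... once-per-machine work (generate host keys, register, etc.) ...
    stamp.write_text(machine_id + "\n")
    return True


with tempfile.TemporaryDirectory() as tmp:
    stamp_file = Path(tmp) / "initialized-for"
    # Hypothetical IDs: the image's original ID and the one assigned to a clone.
    print(run_first_boot_tasks_if_needed("a" * 32, stamp_file))  # first boot
    print(run_first_boot_tasks_if_needed("a" * 32, stamp_file))  # later boot
    print(run_first_boot_tasks_if_needed("b" * 32, stamp_file))  # after ID change
```

Changing the machine ID on a cloned image is what makes the clone look uninitialized again.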
> These days I think that definition should be updated to "container" instead of "kernel": if you're running separate logical machines inside the same kernel with separate PIDs etc., they should have separate machine IDs.
Indeed. If that weren't the case, and the "same kernel" were enough, then things could just use /proc/sys/kernel/random/boot_id.
Because the machine-id is intended to be something that persists between reboots, it is necessarily something that would live in the filesystem, independent of the kernel.
Your described functionality could be accomplished with a FUSE filesystem, though.
However, that functionality would be problematic. Programs (like D-Bus) expect to be able to use it to identify whether 2 communicating processes are on the same host.
If it served different bullshit to each process, it would be entirely non-functional. (Sans returning the true ID to root, this is /proc/sys/kernel/random/uuid)
Perhaps instead, use a determined-at-boot value (as the machine-id(5) docs say is acceptable for stateless systems). If this is a kernel construct that isn't associated with a specific (PID?) namespace, then this would also be problematic, as different containers would be considered to be the same "host". (Sans returning the true ID to root, this is /proc/sys/kernel/random/boot_id)
It would still have to live somewhere in the filesystem, and have a userspace program load it into the kernel, just as `utsname.nodename` is loaded from /etc/hostname.
Yes; that area of the filesystem can be readable only to root. It could also have other avenues of entry: it could come from the boot firmware via the kernel command line, or be in a device tree blob or whatever.
A word of warning. If you try this, make sure the last command is started on boot and runs before D-Bus! I completely forgot I had done this, and I just spent a few hours trying to figure out why my system was hanging on boot. It turns out that D-Bus reads /etc/machine-id on start-up, and naturally by design, it will wait until it receives data from the named pipe before proceeding with execution.
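For anyone wondering why a missing writer hangs the reader: opening a named pipe for reading blocks until something opens it for writing. A small self-contained sketch of that behavior follows (POSIX only). It uses a temporary FIFO rather than the real /etc/machine-id, and the function names and the 0.2 second delay are arbitrary choices for the demo.

```python
import os
import secrets
import tempfile
import threading
import time


def serve_one_id(fifo_path: str, delay: float) -> None:
    """Writer side: start late, then hand out one random ID."""
    time.sleep(delay)  # simulate the writer being started after the reader
    with open(fifo_path, "w") as fifo:
        fifo.write(secrets.token_hex(16) + "\n")


def read_id_from_fifo(delay: float = 0.2) -> tuple[str, float]:
    """Reader side: returns the value read and how long the read was stuck."""
    with tempfile.TemporaryDirectory() as tmp:
        fifo_path = os.path.join(tmp, "machine-id")
        os.mkfifo(fifo_path)
        writer = threading.Thread(target=serve_one_id, args=(fifo_path, delay))
        writer.start()
        started = time.monotonic()
        with open(fifo_path) as fifo:  # blocks until the writer opens the pipe
            value = fifo.read()
        waited = time.monotonic() - started
        writer.join()
        return value, waited


value, waited = read_id_from_fifo()
print(f"read {len(value.strip())} hex chars after waiting {waited:.2f}s")
```

If the writer never starts, the `open()` on the reader side never returns, which matches the boot hang described above.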