Devuan considers machine IDs

nickysielicki · on March 30, 2019

If a file on your computer is being used by a program to send information to someone, the answer isn't to destroy/randomize the file and break other applications, the answer is to not use the program that is sending your information somewhere.

mirimir · on March 30, 2019

Sure, but how do you know what programs are misusing it?

btown · on March 30, 2019

This speaks to a more general need for user-friendly audit logs of which resources are accessed by which programs. I should be able to tell on any platform if Spotify called fopen on something in my documents folder.

auslander · on March 30, 2019

SELinux or AppArmor; keep it in enforcing mode.

nickysielicki · on March 30, 2019

I think this is largely possible with eBPF, if you cared enough.

yellowapple · on April 1, 2019

Rename it and see which programs complain :)

jlgaddis · on March 30, 2019

  auditd(8)

mirimir · on March 30, 2019

Thanks.

https://security.blogoverflow.com/2013/01/a-brief-introducti...

0815test · on March 30, 2019

Or sandbox the program so that it sees a "safe" version of that information.

JohnFen · on March 31, 2019

But what faster way to discover which applications to remove than by deleting /etc/machine-id?

ez7r6i · on March 30, 2019

Chromium reads it, but are we sure it's sending it somewhere? Maybe it uses it for bookkeeping of local sessions or something like that.

koolba · on March 30, 2019

I checked the Chromium source and for Linux they explicitly mention not being allowed to send it externally[0]; they hash it via SHA1, encode it as base64, and use that value.

Interestingly for Windows they pull the machine id from the registry[1] and (at first glance) it doesn't seem like they're doing any hashing. The raw value gets used.

Haven't checked if the value gets sent externally but based upon the comment on the Linux code I'd bet it's a yes.

[0]: https://github.com/chromium/chromium/blob/aae20fb7d3616de40e...

[1]: https://github.com/chromium/chromium/blob/01a03aab2d89c93c15...

jolmg · on March 30, 2019

> I checked the Chromium source and for Linux they explicitly mention not being allowed to send it externally[0]; they hash it via SHA1, encode it as base64, and use that value.

/etc/machine-id seems to be a random value. What purpose does hashing it before sending it do? The hash should still uniquely identify a machine. Am I missing something? Kind of makes me think that it's just done so people say "it's ok, because they're hashing it first, so it's secure!", while in reality hashing doesn't do anything to alleviate any concern.

MaulingMonkey · on March 30, 2019

> Am I missing something?

The man page that the source code links to (https://www.freedesktop.org/software/systemd/man/machine-id....) mentions hashing with an "application-specific key", which would at least make it not correlatable between different apps (so $WEBSITE can't correlate machine IDs with $WELL_BEHAVED_APP's machine IDs).

But either I'm missing something, or Chromium is - it looks like it's straight up hashing the file and not actually using any application-specific keys!
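
To make the distinction concrete, here is a minimal Python sketch of the two approaches. It is not Chromium's or systemd's code: the key values and function names are made up for illustration, and HMAC-SHA256 is just one reasonable choice of keyed hash. The unkeyed hash is the same for every program that computes it, while the keyed hash differs per application key.

```python
import base64
import hashlib
import hmac


def plain_hash(machine_id: str) -> str:
    """Unkeyed SHA-1 of the ID, base64-encoded (the scheme described upthread)."""
    digest = hashlib.sha1(machine_id.encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def app_specific_hash(machine_id: str, app_key: bytes) -> str:
    """Keyed hash: HMAC-SHA256 over the ID with a fixed per-application key."""
    return hmac.new(app_key, machine_id.encode("ascii"), hashlib.sha256).hexdigest()


# Example ID in the format the man page describes (32 lowercase hex characters).
MACHINE_ID = "0123456789abcdef0123456789abcdef"

# Hypothetical per-application keys; a real application would embed its own.
KEY_APP_A = b"example-key-for-app-a"
KEY_APP_B = b"example-key-for-app-b"

if __name__ == "__main__":
    print(plain_hash(MACHINE_ID))                    # identical for every app
    print(app_specific_hash(MACHINE_ID, KEY_APP_A))  # differs from app B's value
    print(app_specific_hash(MACHINE_ID, KEY_APP_B))
```

Either value is still a stable per-machine identifier for whoever receives it; the key only stops two different applications' values from being matched against each other.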

JohnFen · on March 31, 2019

> they explicitly mention not being allowed to send it externally[0]; they hash it via SHA1, encode it as base64, and use that value.

If it's sending that value out, then there is no logical difference between that and just sending out the machine-id in the first place.

floatingatoll · on March 30, 2019

/etc/machine-id reading was implemented in service of the Chromium "Enterprise" component:

https://bugs.chromium.org/p/chromium/issues/detail?id=812641

Comment 26 implements Linux support for reading the machine's unique identifier:

https://chromium.googlesource.com/chromium/src.git/+/15dc90a...

That patch is Linux-only support in service of the greater patch:

https://chromium.googlesource.com/chromium/src.git/+/81a7040...

And having read several rounds of "Device enrollment" phrases now, while the design document is non-public, I would hazard a guess that these are the essential components of enterprise device management.

Google hashes the ID before making use of it, so the actual ID remains disguised, but this absolutely would be necessary if they were trying to implement ChromeOS enterprise device management. (You need a unique identifier per enterprise machine, etc.)

Presumably they only care about this id file with respect to enterprise ChromeOS installations, since they make no effort at all to locate the file in any other location than the one.

It looks more like they simply don't care about reading the file in non-enterprise circumstances, since either the machine is enterprise-managed or it isn't, and as they only transmit hashes of the ID rather than the ID itself, they're in compliance with the FreeDesktop guidelines that require this file to be present:

https://www.freedesktop.org/software/systemd/man/machine-id....

JdeBP · on March 31, 2019

See http://jdebp.uk./Softwares/nosh/guide/commands/machine-id.xm...

swinglock · on March 30, 2019

Removing it just makes it more difficult to write legit programs that have a use for such features, while anything nefarious will be able to find other things to use as fingerprints, including hardware serials, MACs and their own fingerprint files spread across the filesystem in non-standard locations.

Unless the OS is meant to be built for privacy and has a goal to run every app in a sandbox where nothing is fingerprintable, removing easily available fingerprints would be a disservice to all.

JohnFen · on March 31, 2019

> Removing it just makes it more difficult to write legit programs

Yeah, that's something I simply could not care less about.

That said, I don't remove it. I set its permissions so that it isn't world-readable instead.
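
A rough sketch of that permission change in Python, for anyone who would rather script it. The function name is made up, and it only clears the read bit for "others"; owner and group bits are left alone.

```python
import os
import stat


def drop_world_read(path: str) -> int:
    """Clear the 'others may read' bit on path and return the new mode bits."""
    current = stat.S_IMODE(os.stat(path).st_mode)
    updated = current & ~stat.S_IROTH
    os.chmod(path, updated)
    return updated
```

Calling `drop_world_read("/etc/machine-id")` needs root, and any program running as an ordinary user that expects to read the file will then get a permission error.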

swinglock · on April 4, 2019

mirimir · on March 30, 2019

OK, from the man page:

> The /etc/machine-id file contains the unique machine ID of the local system that is set during installation. The machine ID is a single newline-terminated, hexadecimal, 32-character, lowercase ID. When decoded from hexadecimal, this corresponds to a 16-byte/128-bit value.

> The machine ID is usually generated from a random source during system installation and stays constant for all subsequent boots. Optionally, for stateless systems, it is generated during runtime at early boot if it is found to be empty.

So if that works for "stateless systems", why can't all machines be "stateless systems"?
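
For reference, the format in that quote is easy to check mechanically. A small Python sketch (the function name is made up; it is lenient about a missing trailing newline, and it also applies the "may not be all zeros" rule from the same man page):

```python
import re

# 32 lowercase hexadecimal characters, optionally followed by one newline.
_MACHINE_ID_RE = re.compile(r"[0-9a-f]{32}\n?")


def is_valid_machine_id(text: str) -> bool:
    """Check text against the format quoted from the man page."""
    if _MACHINE_ID_RE.fullmatch(text) is None:
        return False
    # The same man page says a machine ID may not be set to all zeros.
    return text.strip() != "0" * 32
```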

dmurray · on March 30, 2019

It's not clear that would make anyone happy.

You might really want to track machines in your fleet, in a way that persists across reboots. Let's say you can access a machine remotely but you got all your ethernet cables tangled up, so you don't know which physical machine you SSHed into.

Or if you are concerned about being tracked by a third party, you don't want this identifier to exist at all, even if it doesn't persist through reboots.

I agree there are other solutions in both cases.

patrickg_zill · on March 30, 2019

So basically deleting the file as the last step before a shut down or reboot will work, is that correct?

JdeBP · on March 31, 2019

Incorrect.

Machine IDs are stored in several places, some of which are not even files.

* http://jdebp.uk./Softwares/nosh/guide/commands/machine-id.xm...

They can be resurrected if not all storage locations are dealt with. Moreover, some systems use things like the SMBIOS product UUID.

* http://jdebp.uk./Softwares/nosh/guide/commands/setup-machine...

The correct approach is not deleting files.

* http://jdebp.uk./Softwares/nosh/guide/commands/erase-machine...

* https://lists.debian.org/debian-user/2019/03/msg00550.html

mirimir · on March 31, 2019

Thanks.

So at boot, one would run "erase-machine-id". Then create a random 32-character hexadecimal number. And then either set "the systemd.machine_id= kernel command line parameter" to it. Or pass it via "the option --machine-id= to systemd".

JdeBP · on March 31, 2019

No.

    % system-control cat machine-id
    start:#!/bin/nosh
    start:true
    stop:#!/bin/nosh
    stop:envdir env
    stop:erase-machine-id
    run:#!/bin/nosh
    run:#Set up and tear down the machine ID
    run:envdir env
    run:setup-machine-id
    restart:#!/bin/sh
    restart:exec false      # ignore script arguments
    %

mirimir · on March 31, 2019

OK, thanks.

So "setup-machine-id". Does that generate a value that's unrelated to any preexisting versions, analogs, etc?

I ask because, when I deleted /etc/machine-id and ran "systemd-machine-id-setup", it generated a new machine-id by copying the D-Bus machine ID.

mirimir · on March 30, 2019

I'll test that.

Edit: OK, so I created a Debian VM, noted /etc/machine-id, deleted it, and rebooted. And found that it was still missing.

Running systemd-machine-id-setup generated a new machine-id from the D-Bus machine ID. And it was the same as the initial one.

But I also see in man machine-id:

> The machine-id may also be set, for example when network booting, by setting the systemd.machine_id= kernel command line parameter or passing the option --machine-id= to systemd. A machine-id may not be set to all zeros.

patrickg_zill · on March 30, 2019

Does something like

    dd if=/dev/urandom bs=1 count=16 | hexdump

Give you something that you can use?

mirimir · on March 30, 2019

Well, the machine-id of that Debian VM was ...

    38d05397c25548b4f4bda7751b5062

... and ...

    $ FOO=`cat /dev/urandom | tr -dc a-z0-9 | head -c${1:-30}`
    $ echo $FOO
      6we4gvnmx00w208ffty6i11m82rw6d

But I have no clue whether systemd would be happy with that. Maybe later I can test more.

Edit: Oops. Make that ...

    $ FOO=`cat /dev/urandom | tr -dc abcdef0-9 | head -c${1:-30}`
    $ echo $FOO
      b486935dbb9e9fe603328a19e2b5b4
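
Note that the man page quoted earlier describes a 32-character lowercase hexadecimal ID (16 bytes), so a 30-character value would not match that format. A minimal Python sketch that produces a value of the documented shape; the function name is made up, and whether systemd is happy with a hand-generated value is still untested here:

```python
import secrets


def new_machine_id() -> str:
    """Return 16 random bytes as 32 lowercase hexadecimal characters."""
    while True:
        candidate = secrets.token_hex(16)
        # The man page says a machine ID may not be all zeros.
        if candidate != "0" * 32:
            return candidate


if __name__ == "__main__":
    print(new_machine_id())
```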

justinclift · on March 31, 2019

Maybe use pwgen?

Something like:

    $ pwgen -1s 31
    qHfKU46H2RA2WUr0EZ1zBHfIBLKZKuT

mirimir · on March 31, 2019

I think that it needs to be hexadecimal. But not sure.

justinclift · on March 31, 2019

No worries. :)

tinus_hn · on March 30, 2019

Because then Google doesn’t have a persistent identifier to track you, of course.

salawat · on March 30, 2019

What is it with the fetish for inflicting Truenaming in cyberspace?

It's incredibly annoying. It needlessly bloats digital footprints, and it creates an opportunity for exploitation by nefarious actors.

Leave the Truenaming to the users that need it. It doesn't do any good being baked in by default. If they really need it, they'll figure out a way to implement it. If they definitely cannot afford it, and aren't aware it is there by default, you are doing more harm putting it in than you would be by leaving well enough alone.

jackewiehose · on March 30, 2019

Why would anyone worry that applications like chrome abuse that file? If chrome wants a unique identifier it could generate it itself.

mirimir · on March 30, 2019

Revisiting this, I agree that there's quite some "meh" about this. I mean, there's no way to really know how machines have and share identifiers. So one must assume that they have, and do. And deal with it.

VMs seem generally good enough. But then there's WebGL, which generates identifiers based on the host graphics system and guest virtual video driver. So all Debian VMs on a given host have the same identifier.

If it really matters, though, you gotta use different hardware.

mirimir · on March 30, 2019

Because the machine-id by default never changes after OS installation.

swinglock · on March 30, 2019

Neither does a file Chrome generates. Not even if reinstalled.

mirimir · on March 30, 2019

Huh?

Even if you do

    $ sudo apt-get -y purge chrome

[or whatever its package name is]?

And if necessary, find and delete everything that it created.

pantalaimon · on March 30, 2019

apt doesn't know about any file the application might have ever created in your home directory.

jhardy54 · on March 30, 2019

> And if necessary, find and delete everything that it created.

mirimir · on March 30, 2019

Actually, apt does a pretty good job at finding stuff. Sometimes it can't delete, but it warns you about that.

Evidlo · on March 30, 2019

Apt only knows about the files that are listed in the package.

mirimir · on March 30, 2019

Sure, but then that means that honest packages should document all files that the software can be configured to create automatically. Or at least, all but user-specified ones.

JohnFen · on March 31, 2019

In Linux, I can find and nuke everything that Chrome (or any other app) generates.

morganvachon · on March 30, 2019

Bad title (and I know that fault lies with the source site). At first glance it reads as if the Devuan team is considering embracing/adopting machine IDs, which is counter to their philosophy, when in fact they are against unique identifiers.

amelius · on March 30, 2019

Doesn't a MAC address lookup or disk UUID lookup provide similar fingerprinting capabilities? Even the contents of one's .bashrc file could be used for fingerprinting.

I mean, if you start blocking one thing, where do you stop?

koolba · on March 30, 2019

Ideally you work in reverse and only need to grant access to things to which you want to allow access. Start from zero and let an application provide a manifest of what it wants to access. Ex:

* Open outbound TCP sockets

* Read from $HOME/.config/chromium

* Read from /etc/machine-id

It's not a new concept at all and there are multiple approaches for implementing things like this. Getting mass market adoption is sadly next to impossible.
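
As a toy illustration of the default-deny idea, here is a Python sketch. Everything in it is hypothetical (the `Manifest` type, the `may_read` helper, and `/home/user` standing in for $HOME); real mechanisms enforce this in the kernel or a sandbox, not in application code.

```python
import posixpath
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Manifest:
    """Hypothetical per-application manifest; anything not listed is denied."""
    allow_outbound_tcp: bool = False
    readable_paths: frozenset = field(default_factory=frozenset)


def may_read(manifest: Manifest, path: str) -> bool:
    """Allow a read only if path is a listed entry or lies beneath one."""
    # Collapse ".." segments first; symlinks are NOT resolved in this sketch.
    target = PurePosixPath(posixpath.normpath(path))
    for entry in manifest.readable_paths:
        allowed = PurePosixPath(entry)
        if target == allowed or allowed in target.parents:
            return True
    return False


# The three grants from the list above, with $HOME assumed to be /home/user.
CHROMIUM = Manifest(
    allow_outbound_tcp=True,
    readable_paths=frozenset({"/home/user/.config/chromium", "/etc/machine-id"}),
)
```

With this manifest, `may_read(CHROMIUM, "/etc/machine-id")` is allowed and `may_read(CHROMIUM, "/home/user/Documents/tax.pdf")` is not; an application with an empty manifest can read nothing.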

amelius · on March 30, 2019

That's basically how smartphones do it.

The problem is still that the granularity is not right (except for users who simply want to trust the application). For example, when uploading a photo, I don't want to give Facebook access to my entire filesystem, just the photo that I click. And I don't want to give Facebook access to my camera indefinitely, just now.

It will require a lot of design to get security right without deteriorating the UX too much.

But I agree, it's better than simply blocking everything.

amaccuish · on March 30, 2019

> For example, when uploading a photo, I don't want to give Facebook access to my entire filesystem, just the photo that I click. And I don't want to give Facebook access to my camera indefinitely, just now.

FYI that's exactly how iOS does it. When you choose a picture to upload from the camera roll, the target app only gains access to that one photo.

qmarchi · on March 30, 2019

> I don't want to give Facebook access to my entire filesystem,

Android's system of intents, and the changes to storage security in Android Q, will help with this somewhat. Android Q starts expressly prohibiting access to the full filesystem, and images and other intent extras must be passed through the intent call rather than as a reference to a file on the filesystem.

amelius · on March 30, 2019

That sounds great. But what I really want is if the app does want access to the entire filesystem, then the OS will present the app with a sandboxed filesystem, instead of just blocking the app (causing the app to refuse to work, which is what will happen in practice).

SeriousM · on March 30, 2019

Oh, chromium may break? Then I use an alternative. This is true for any other program.

hedora · on March 30, 2019

Or, Devuan could arrange to always use 0xfoad or something appropriate if the file is missing.

Even better, it looks like there has been a file that always has the same value for a while; presumably they can just keep that in place.

squarefoot · on March 31, 2019

Privacy concerns aside, using the boot device serial number might be a better solution as it can't be deleted or modified and survives reboots and reinstalls.

This line will find it easily. Not mine, I simply put together the work of others adding only very small modifications. Needs smartctl (smartmontools package on Debian) which can be run only by root.

    smartctl -i `df -P / | tail -n 1 | awk '/.*/ { print $1 }'` | egrep ^"Serial Number:" | awk '{print $3}'

gvand · on March 30, 2019

Why is it there in the first place?

geofft · on March 30, 2019

https://dbus.freedesktop.org/doc/dbus-uuidgen.1.html has some explanation, in particular:

> The important properties of the machine UUID are that 1) it remains unchanged until the next reboot and 2) it is different for any two running instances of the OS kernel. That is, if two processes see the same UUID, they should also see the same shared memory, UNIX domain sockets, local X displays, localhost.localdomain resolution, process IDs, and so forth.

Because it's possible to forward things like D-Bus, the X11 $DISPLAY, etc. over the network, two processes might be aware of each other over such a connection but not be running on the same machine and therefore be unable to share resources. The machine ID lets them check for that, so you can properly handle things like "I'm going to send a message to the screensaver in my display to not activate, I don't care if it's the same machine" vs. "I'm going to send a message to the terminal in my display to open a new tab, but only if it's actually on the same machine, otherwise I should start a new terminal". (These days I think that definition should be updated to "container" instead of "kernel": if you're running separate logical machines inside the same kernel with separate PIDs etc., they should have separate machine IDs.)

systemd and (IIRC) cloud-init use it to run once-per-machine tasks on machines that could come from images: if you want to prep a number of machines in advance, do the install, then change the machine ID. At boot time, startup scripts will say "Oh, this machine ID has not been initialized yet" and do things, and then not do them on the next boot.
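
The once-per-machine pattern can be sketched in a few lines of Python. This is only an illustration of the idea, not how systemd or cloud-init actually implement it; the function name and the stamp file are invented for the example.

```python
from pathlib import Path


def run_once_per_machine(machine_id: str, stamp: Path, task) -> bool:
    """Run task() unless stamp already records machine_id.

    Returns True if the task ran, False if it was skipped.
    """
    seen = stamp.read_text().strip() if stamp.exists() else ""
    if seen == machine_id:
        return False
    task()
    # Remember which machine ID the task was last run for.
    stamp.write_text(machine_id + "\n")
    return True
```

A cloned image that gets a fresh machine ID no longer matches the stamp, so the task runs again on the clone and is then skipped on later boots.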

LukeShu · on March 30, 2019

> These days I think that definition should be updated to "container" instead of "kernel": if you're running separate logical machines inside the same kernel with separate PIDs etc., they should have separate machine IDs.

Indeed. If that weren't the case, and the "same kernel" were enough, then things could just use /proc/sys/kernel/random/boot_id.
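
A quick way to look at both values side by side, as a Python sketch (the helper name is made up). The boot ID is generated by the kernel at each boot, whereas /etc/machine-id is an ordinary file that persists across boots.

```python
from pathlib import Path
from typing import Optional

MACHINE_ID_PATH = Path("/etc/machine-id")
BOOT_ID_PATH = Path("/proc/sys/kernel/random/boot_id")


def read_id(path: Path) -> Optional[str]:
    """Return the stripped contents of an ID file, or None if unreadable."""
    try:
        return path.read_text().strip()
    except OSError:
        # Missing file, permission denied, and similar errors.
        return None


if __name__ == "__main__":
    print("machine-id:", read_id(MACHINE_ID_PATH))
    print("boot_id:   ", read_id(BOOT_ID_PATH))
```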

JdeBP · on March 31, 2019

See http://jdebp.uk./Softwares/nosh/guide/commands/machine-id.xm...

kazinator · on March 30, 2019

How about making it a symlink to a kernel feature:

   /etc/machine-id -> /proc/some/path/machine-id

this fictitious proc entry that I just invented serves up bullshit content to unprivileged processes, but a true ID to the superuser.

LukeShu · on March 30, 2019