
 Automatic device assembly by udev
 =================================

 We want to asynchronously assemble and activate devices as their components
 become available. Eventually, the complete storage stack should be covered,
 including: multipath, cryptsetup, LVM, mdadm. Each of these can be addressed
 more or less separately.

 The general plan of action is simply to provide udev rules for each device
 "type": MD component devices, PVs, LUKS/crypto volumes and multipathed SCSI
 devices. There's no compelling reason to have a daemon do these things: all
 systems that actually need to assemble multiple devices into a single entity
 either already support incremental assembly or will do so shortly.

 Whenever we talk about udev rules in this document, these may include helper
 programs that implement a multi-step process. In many cases, it can be expected
 that the functionality can be implemented in a couple of lines of shell (or a
 couple of hundred lines of C).
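
 To give a feel for how small such a helper can be, here is a minimal sketch of
 a dispatcher that a udev rule might call. It is written in Python purely for
 illustration; the function name plan_action is made up, and the sketch assumes
 that the rule passes along the DEVNAME and ID_FS_TYPE properties with the
 values blkid usually reports. It only decides which command to run and does
 not run anything.

```python
import os


def plan_action(props):
    """Return the argv to run for a newly appeared block device, or None.

    Hypothetical helper: 'props' is a mapping of udev properties. The
    property names and type strings are assumptions about what the rule
    exports, not something this document specifies.
    """
    dev = props.get("DEVNAME")
    fs_type = props.get("ID_FS_TYPE")
    if not dev:
        return None
    if fs_type == "linux_raid_member":
        # MD component: hand the device to mdadm's incremental mode.
        return ["mdadm", "--incremental", dev]
    if fs_type == "LVM2_member":
        # PV: let pvscan update the metadata cache.
        return ["pvscan", "--cache", dev]
    # LUKS volumes need a key first, so nothing is planned for them here.
    return None


if __name__ == "__main__":
    argv = plan_action(os.environ)
    if argv:
        print(" ".join(argv))
```

 Keeping the decision separate from the execution makes the helper easy to test
 without touching real devices.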

 Multipath
 ---------

 For multipath, we will need to rely on SCSI IDs for now, until we have a
 better scheme: a multipath device can't be identified as such until the
 second path appears, but unfortunately we need to decide whether a device is
 multipath when the *first* path appears. The multipath folks need to sort
 this out, but it shouldn't be too hard. Just bring up multipathing on
 anything that appears and is set up for multipathing.
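
 The first-path decision can be modelled roughly as follows. This is only a
 sketch: the class MultipathTracker, the action names and the idea of a
 preconfigured set of SCSI IDs are assumptions made for illustration, and no
 real multipath tool is invoked.

```python
from collections import defaultdict


class MultipathTracker:
    """Toy model: decide what to do with a path based on its SCSI ID."""

    def __init__(self, configured_ids):
        # SCSI IDs that the administrator has set up for multipathing.
        self.configured_ids = set(configured_ids)
        # SCSI ID -> list of device nodes seen so far.
        self.paths = defaultdict(list)

    def path_appeared(self, devnode, scsi_id):
        """Return 'create-map', 'add-path' or 'ignore' for a new path."""
        if scsi_id not in self.configured_ids:
            return "ignore"
        known = self.paths[scsi_id]
        if devnode in known:
            # Duplicate event for a path we already track.
            return "ignore"
        known.append(devnode)
        # The decision is made on the first path, before a second one exists.
        return "create-map" if len(known) == 1 else "add-path"
```

 The point is that the configured SCSI ID alone is enough to act on the first
 path; later paths with the same ID simply join the existing map.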

 LVM
 ---

 For LVM, the crucial piece of the puzzle is lvmetad, which allows us to build up
 VGs from PVs as they appear, and at the same time collect information on what is
 already available. A command, pvscan --cache, is expected to be used to
 implement the udev rules. It is relatively easy to make this command print out a
 list of VGs (and possibly LVs) that have been made available by adding any
 particular device to the set of visible devices. In other words, udev says "hey,
 /dev/sdb just appeared", calls pvscan --cache, which talks to lvmetad, which
 says "cool, that makes vg0 complete". pvscan takes this info and prints it out,
 and the udev rule can then decide whether anything needs to be done about this
 "vg0". Presumably a table of devices that need to be activated automatically
 is made available somewhere in /etc (probably just a simple list of volume
 groups or logical volumes, given by name or UUID, with globbing possible). The
 udev rule can then consult this file.
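
 The flow described above can be sketched as a toy model. Everything here is
 hypothetical: ToyMetadataCache merely stands in for the bookkeeping attributed
 to lvmetad (and is told the VG layout up front, whereas the real daemon would
 learn it from PV metadata), and the activation list format (one name or glob
 per line, '#' for comments) is an assumption, since the document leaves the
 format open.

```python
from fnmatch import fnmatchcase


class ToyMetadataCache:
    """Stand-in for the bookkeeping the text attributes to lvmetad."""

    def __init__(self, vg_layout):
        # vg_layout maps a VG name to the set of PV identifiers it needs.
        self.vg_layout = {vg: set(pvs) for vg, pvs in vg_layout.items()}
        self.seen = set()

    def pv_appeared(self, pv):
        """Record a PV and return the VGs that it has just completed."""
        if pv in self.seen:
            return []
        self.seen.add(pv)
        return sorted(
            vg for vg, pvs in self.vg_layout.items()
            if pv in pvs and pvs <= self.seen
        )


def parse_activation_list(text):
    """Parse the hypothetical list in /etc: one name or glob per line."""
    patterns = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            patterns.append(line)
    return patterns


def should_activate(vg, patterns):
    """Return True if the VG name matches any entry of the list."""
    return any(fnmatchcase(vg, p) for p in patterns)
```

 A rule helper would call pv_appeared() for each new device and activate only
 those completed VGs for which should_activate() returns True.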

 Cryptsetup
 ----------

 This may be the trickiest of the lot: the obvious hurdle here is that crypto
 volumes need to somehow obtain a key (passphrase, physical token or such),
 meaning there is interactivity involved. On the upside, dm-crypt is a 1:1
 system: one encrypted device results in one decrypted device, so no assembly or
 notification needs to be done. While interactivity is a challenge, there are at
 least partial solutions around. (TODO: Milan should probably elaborate here.)

 (LUKS devices can probably be detected automatically. I suppose that non-LUKS
 devices can be looked up in crypttab by the rule, to decide what the
 appropriate action is.)
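
 Both checks are simple enough to sketch. The LUKS header starts with the magic
 bytes "LUKS" followed by 0xBA 0xBE, and crypttab lines consist of a target
 name, a source device, an optional key file and optional comma-separated
 options. The function names below are made up for illustration, and the lookup
 compares the source field as a plain string, which is a simplification.

```python
import io

# Magic bytes at the very start of a LUKS header.
LUKS_MAGIC = b"LUKS\xba\xbe"


def is_luks(stream):
    """Return True if the binary stream starts with the LUKS magic."""
    return stream.read(len(LUKS_MAGIC)) == LUKS_MAGIC


def crypttab_lookup(text, source):
    """Return (name, keyfile, options) for a source device, or None.

    'text' is the content of a crypttab-style file. Only an exact string
    match on the source field is attempted.
    """
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2 or fields[1] != source:
            continue
        keyfile = fields[2] if len(fields) > 2 else "none"
        options = fields[3].split(",") if len(fields) > 3 else []
        return fields[0], keyfile, options
    return None


if __name__ == "__main__":
    # In-memory stand-in for the first bytes of a block device.
    print(is_luks(io.BytesIO(LUKS_MAGIC + b"\x00\x01")))
```

 A rule could try is_luks() first and fall back to crypttab_lookup() for
 everything else; obtaining the key remains the hard part.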

 MD
 --

 Fortunately, MD (namely mdadm) already comes with a mechanism for incremental
 assembly (mdadm -I or such). We can assume that this fits with the rest of the
 stack nicely.


 Filesystem &c. discovery
 ========================

 Considering other requirements that exist for storage systems (namely
 large-scale storage deployments), it is absolutely not feasible to have the
 system hunt automatically for filesystems based on their UUIDs. In a number of
 cases, this could mean activating tens of thousands of volumes. On small
 systems, asking for all volumes to be brought up automatically is probably the
 best route anyway, and once all storage devices are activated, scanning for
 filesystems is no different from today.

 In effect, no action is required on this count: only filesystems that are
 available on already active devices can be mounted by their UUID. Activating
 volumes by naming a filesystem UUID is useless, since to read the UUID the
 volume needs to be active first.