Scsi_debug adapter driver for Linux

Introduction

Parameters

Supported SCSI commands

Logical and physical block size

Logical block provisioning

Adding and removing hosts and devices

Mode pages

doublestore and VERIFY

REPORT LUNS Well Known LU

SAS personality

Downloads

Conclusion

Please note: on 25 March 2020 the contents of this page were moved to: scsi_debug.html on this site. This page will receive no further updates.

Introduction

The scsi_debug adapter driver simulates a variable number of SCSI disks, each sharing a common amount of RAM allocated by the driver to act as (volatile) storage. With one SCSI disk simulated, the scsi_debug driver is functionally equivalent to a RAM disk. When multiple SCSI disks are simulated, they could be viewed as multiple paths to the same storage device or simply as separate devices. The driver can also be used to simulate very large disks, 2 terabytes or more in size, by "wrapping" its data access within the available RAM.

A small but hopefully useful set of SCSI commands is supported along with some crude error checking. The number of simulated devices and the shared RAM size for storage can be given as module parameters or boot time parameters if the scsi_debug driver is built into the kernel. The number of simulated devices (and hosts) can be varied at run time via sysfs. Various error conditions can be optionally generated to test the reaction of upper levels of the kernel and applications to abnormal situations.

To create real SCSI targets and Logical Units for something like an iSCSI server, the reader should preferably be looking at the Linux target subsystem. In short: if you want to create real SCSI devices then use the target subsystem; if you want to test or break something then read on.

This page describes this driver as found in the Linux kernel version 4.18.0; earlier versions of this driver worked with the Linux kernel 2.6 series. For information about the scsi_debug driver found in the lk 2.4 production series see this page.

Parameters

The parameter name given in the table below is the module parameter name and the sysfs file name. The boot time parameter (if the scsi_debug driver is built into the kernel (not recommended)) has "scsi_debug." prepended to it. Hence the boot time parameter corresponding to add_host=2 is scsi_debug.add_host=2 .

When the scsi_debug module is loaded, many parameters can be given on the command line, separated by spaces. For example, to simulate 140 disks "modprobe scsi_debug max_luns=2 num_tgts=7 add_host=10" could be used. This will generate 140 devices: 10 hosts, each with 7 targets, each with 2 logical units.

Sysfs parameters can be read with the cat command and written with the echo command. Sysfs expects a driver to be associated with a bus (e.g. PCI) so the "pseudo" bus was created for drivers like scsi_debug. An example:
# cd /sys/bus/pseudo/drivers/scsi_debug
# cat dev_size_mb
8
# echo 1 >add_host
# echo 64 >dev_size_mb
In the /sys/module/scsi_debug/parameters directory the parameters used when the scsi_debug module was started (or their default values) are listed. Even though some of those parameters have writable permissions, writing to them has no effect on the driver. Note that some parameters appear in this directory but not in the /sys/bus/pseudo/drivers/scsi_debug directory. If the sysfs access cell for a parameter in the following table is blank then it doesn't appear in the /sys/bus/pseudo/drivers/scsi_debug directory but does appear in the /sys/module/scsi_debug/parameters directory.

Here is a list of scsi_debug specific driver parameters:

Parameter name	default value	sysfs access	sysfs write effect	new in version	notes
add_host	1	read-write	immediate		can add or remove hosts at runtime
ato	1	read only	-	1.81	application tag ownership (0 -> disk, 1 -> host)
cdb_len	10	read-write	next command	0187	6, 10, 12, 16 and 32 accepted, other numbers treated same as 10. Size of READs, WRITEs and MODE SENSEs generated by the sd driver for the block layer. When 32 is given, it is treated as if 16 was given.
clustering	0			1.84	enable large transfers
delay	1	read-write	next command		IO command response delay: units are jiffies (configurable: 1 to 10 ms) . 0: no delay, all in one thread; -1: use "hi" tasklet; -2: use normal tasklet
dev_size_mb	8	read only			units are Mebibytes (2**20 bytes)
dif	0	read-only		1.81	data integrity field type [T10: protection type]
dix	0	read-only		1.81	data integrity extension mask; check integrity when non zero
doublestore	0	read-write		1.88	when 0 (its default) then one data store (of dev_size_mb) is shared by every scsi_debug device. When set to 1 two data stores are created and they are allocated to scsi_debug devices in an alternating fashion. Note that each scsi_debug device has its own metadata (e.g. start/stop state and Unit Attention state).
dsense	0	read-write	immediate	1.81	0 -> fixed; 1-> descriptor sense format
every_nth	0	read-write	n commands from now		for error injection: 0 -> don't do error injection. When non zero (it can be negative) statistics parameter will be set to 1 if it isn't already.
fake_rw	0	read-write	next command	1.80	when set does no processing when a READ or WRITE command (of any cdb size) is received. When fake_rw=1 no ram is allocated.
guard	0	read-only		1.81	protection checksum: 0 -> crc; 1 -> ip
host_lock	0	read-write	next command	1.84, 1.88	when set wraps each submitted command in a host_lock, which is detrimental in a multi-queue system. From version 1.88 this parameter is ignored.
inq_product	"scsi_debug"			0187	user can set in module start parameters, 16 bytes long, pad spaces to the right to comply with SPC requirements.
inq_rev	driver_ver			0187	For example: "0187". Dropped decimal point and added leading "0" in this version. User can set in module start parameters, 4 bytes long
inq_vendor	"Linux"			0187	user can set in module start parameters, 8 bytes long, pad spaces to the right to comply with SPC requirements.
lbprz	1			2012	LB provisioning: returns 0s when reading unmapped block
lbpu	0			2012	LB provisioning: support UNMAP
lbpws	0			2012	LB provisioning: support WRITE SAME(16) and UNMAP
lbpws10	0			2012	LB provisioning: support WRITE SAME(10)
lowest_aligned	0			1.81	RCAP_16's lowest aligned logical block address (max: 0x3fff)
map		read-only			when logical block provisioning is active, it shows the internal provisioning map. Otherwise it shows '0-<sdebug_store_sectors>'.
max_luns	1	read-write	next positive add_host or scan		responds to luns: 0 ... (max_luns-1) or 1 ... (max_luns-1) if no_lun_0 is set. In 1.86 the max_luns maximum increased from 31 to 256. Uses LUN peripheral device addressing format (address_method=0, bus_identifier=0).
max_queue	192	read-write	next command	1.82	number of commands driver can queue before telling mid-level it is full. Safe to change when commands already queued.
medium_error_count	10	read-write		0188	only active when bit 1 (0x2) of the opts parameter is set
medium_error_start	0x1234	read-write		0188	only active when bit 1 (0x2) of the opts parameter is set
ndelay	0	read-write	next command	1.84	IO command response delay: units are nanoseconds. If > 0 then the delay parameter will be ignored (it appears as -9999)
no_lun_0	0	read-write	next positive add_host or scan	1.77	no lun 0 but responds to INQUIRY and REPORT LUNS as per SPC-2
no_uld	0	read only		1.82	devices (LUs) created by this driver will only attach to sg and bsg devices. So depending on their ptype (peripheral device type) there will be no corresponding /dev/sd, /dev/sr, /dev/st* or /dev/ses* device nodes. There will be /dev/sg* and /dev/bsg/<h:c:t:l> device nodes.
num_parts	0	read only			number of partitions
num_tgts	1	read-write	next positive add_host or scan		targets per host
opt_blks	64			1.84	'Optimal transfer length' field in Block Limits VPD page
opt_xferlen_exp	physblk_exp			1.84	Controls 'Optimal transfer length granularity' field in Block Limits VPD page
opts	0	read-write	usually following commands		0 -> quiet and no error injection (mask 16 to inject aborted_command new in 1.81)
physblk_exp	0			1.81	2**physblk_exp sets READ CAPACITY(16)'s logical blocks per physical block exponent field
ptype	0	read-write	next positive add_host or scan		peripheral device type (0==disk)
random	0	read-write		1.89	when 0 (the default) delays the response of media access SCSI commands as precisely as possible to the duration indicated by the delay and ndelay parameters. When 1 it chooses a delay from a uniform distribution from no delay (0) to the duration indicated by the delay and ndelay parameters. When multiple threads are issuing commands random=1 can be used to simulate out-of-order responses.
removable	0	read-write			When non-zero sets the RMB bit in the INQUIRY response indicating the device is removable
scsi_level	7	read only			from: 0 (no compliance), 1, 2 (SCSI-2), 3 (SPC), 4 (SPC-2), 5 (SPC-3), 6 (SPC-4), 7 (SPC-5)
sector_size	512	read only		1.81	logical block size in bytes. 512, 1024, 2048 and 4096 accepted
statistics	0	read-write	next command	1.86	collect statistics that are output by 'cat /proc/scsi/scsi_debug/<host_num>'. Needs kernel built with CONFIG_SCSI_PROC_FS selected. Due to sysfs policy this is not superseded by sysfs. [Don't believe everything you read in kernel config menus.]
strict	0	read-write	next command	1.85	check for bits set in the reserved part of SCSI command blocks. If found report with the position of the first offending bit.
submit_queues	1	read-only		0187	multi-queue setting from 1 (i.e. non-mq) to <= number of processors on the machine
unmap_alignment	0			1.81	Block limits VPD page's unmap granularity alignment
unmap_granularity	1			1.81	Block limits VPD page's optimal unmap granularity
unmap_max_blocks	0xffffffff			1.81	Block limits VPD page's maximum unmap LBA count
unmap_max_desc	255			1.81	Block limits VPD page's maximum unmap block descriptor count
uuid_ctl	0	read only		1.86	if 1 then each LU name is an internally generated UUID; if 2 then all LUs share the same UUID; if 0 then the LU name is a locally assigned NAA
virtual_gb	0	read-write	immediate, next READ CAPACITY	1.79	When 0 then device is dev_size_mb sized ram disk. When n > 0, "virtual" n Gibibyte size disk, wrapping on dev_size_mb actual ram. The Gibibyte unit is 2**30 bytes
vpd_use_hostno	1	read-write	next positive add_host or scan	1.80	the driver generates serial numbers and SAS naa-5 addresses based on host number ("hostno"), target id and lun. When set to 0, the generated numbers ignore "hostno".
wp	0	read-write		1.89	This parameter is for Write Protection. When 0 (the default) store modifying data access commands are permitted. When 1 store modifying data access commands are not allowed.
write_same_length	0xffff			2012	maximum blocks per WRITE SAME command
zbc	0 ['none']	read-only		1.89	The default is 0 or the string 'none'. To specify host-aware scsi_debug devices use 1 or 'host-aware'; this is currently not implemented and does nothing. To specify host-managed scsi_debug devices use 2 or 'host-managed' which will set the ptype (i.e. the Peripheral Device Type (pdt)) to 0x14. The two latter strings can be shortened to 'aware' and 'managed'. After the scsi_debug module is loaded with 'zbc=managed' say, using sysfs to change this parameter to 'none' will turn all scsi_debug devices into normal disk simulations with a pdt of 0x0.
zone_max_open	8			1.89
zone_nr_conv				1.89
zone_size_mb				1.89

The add_host parameter is the number of hosts (HBAs) to simulate. The default is 1. For boot time and module loads the allowable values are 0 through to a large positive number. For sysfs writes, a value of 0 does nothing while a positive number adds that many hosts and a negative number removes that number of hosts. A sysfs read of this parameter shows the current number of hosts scsi_debug is simulating. No more than num_tgts target ids will be used per host. Target ids are in ascending order from 0 excluding the target id that is used by the initiator (i.e. HBA) if any. The default setting of num_tgts is 1. The default setting for max_luns is 1. So the number of pseudo disks simulated at driver initialization time is (add_host * num_tgts * max_luns). Note that if any of these three parameters is set to zero at kernel boot time or module load time then no devices are created. Modifying the add_host parameter in sysfs can be used to simulate hot plugging and unplugging of hosts. See below for adding and deleting individual SCSI devices.
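The device count formula above can be checked with a few lines of Python. This is only an illustrative sketch: the function name simulated_devices is made up for this example, and it assumes no_lun_0 is not set.

```python
def simulated_devices(add_host=1, num_tgts=1, max_luns=1):
    """Pseudo disks created at driver initialization time.

    Follows the formula in the text: add_host * num_tgts * max_luns.
    Assumes no_lun_0 is 0 (hypothetical helper, not part of the driver).
    """
    if min(add_host, num_tgts, max_luns) < 0:
        raise ValueError("load-time values are expected to be 0 or greater")
    return add_host * num_tgts * max_luns


# The modprobe example from earlier on this page: 10 hosts, 7 targets, 2 LUs.
print(simulated_devices(add_host=10, num_tgts=7, max_luns=2))  # 140
# Any of the three set to zero means no devices are created.
print(simulated_devices(add_host=0, num_tgts=7, max_luns=2))   # 0
```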

The ato parameter sets the field of the same name in the control mode page. The default value is 1 which implies the host is the application tag owner. A value of 0 implies the device server (e.g. the (pseudo) disk) is the application tag owner.

The cdb_len parameter controls the SCSI cdb lengths generated by the sd driver typically when it receives requests from the block layer. There are 3 bool internal variables: use_10_for_rw, use_16_for_rw and use_10_for_ms. "ms" is MODE SENSE/SELECT whose cdb can be 6 or 10 bytes long. If both use_10_for_rw and use_16_for_rw are false then READ(6) or WRITE(6) is used if the LBA and the number_of_blocks are not too large. This parameter can have these settings:

6: use_10_for_rw=false, use_16_for_rw=false, use_10_for_ms=false: try to use READ(6), WRITE(6) and MODE SENSE(6)
10: use_10_for_rw=true, use_16_for_rw=false, use_10_for_ms=false: try to use READ(10), WRITE(10) and MODE SENSE(6)
12: use_10_for_rw=true, use_16_for_rw=false, use_10_for_ms=true: try to use READ(10), WRITE(10) and MODE SENSE(10)
16: use_10_for_rw=false, use_16_for_rw=true, use_10_for_ms=true: try to use READ(16), WRITE(16) and MODE SENSE(10)
others: mapped to 10
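The settings listed above amount to a small lookup table. The following Python sketch restates that table; the names SdCdbFlags and flags_for_cdb_len are invented for illustration, and the entry for 32 follows the note in the parameter table that 32 is treated as 16.

```python
from typing import NamedTuple


class SdCdbFlags(NamedTuple):
    """The three bool variables named in the text (illustrative container)."""
    use_10_for_rw: bool
    use_16_for_rw: bool
    use_10_for_ms: bool


_CDB_LEN_TABLE = {
    6: SdCdbFlags(False, False, False),   # READ(6), WRITE(6), MODE SENSE(6)
    10: SdCdbFlags(True, False, False),   # READ(10), WRITE(10), MODE SENSE(6)
    12: SdCdbFlags(True, False, True),    # READ(10), WRITE(10), MODE SENSE(10)
    16: SdCdbFlags(False, True, True),    # READ(16), WRITE(16), MODE SENSE(10)
}
_CDB_LEN_TABLE[32] = _CDB_LEN_TABLE[16]   # 32 is treated as if 16 was given


def flags_for_cdb_len(cdb_len):
    """Return the flag settings for a cdb_len value; unknown values map to 10."""
    return _CDB_LEN_TABLE.get(cdb_len, _CDB_LEN_TABLE[10])


print(flags_for_cdb_len(16))
print(flags_for_cdb_len(7) == flags_for_cdb_len(10))  # True
```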

Note that this parameter has no control over the sd driver's use of READ(32) and WRITE(32) commands which are generated for some settings of Protection Information (PI).

The clustering parameter informs the SCSI mid layer whether (1) or not (0) clustering is enabled. The default is 0 (not enabled). Setting this parameter facilitates large transfers of data with a single command.

The delay parameter is the number of jiffies by which the driver will delay responses. The default is 1 jiffy unless the ndelay parameter is given, see its description. Setting this parameter to 0 will cause the response to be sent back to the mid level before the request function is completed. The "jiffy" is a kernel space jiffy (typically the largest HZ figure yields a 1 millisecond jiffy on i386) rather than a user space jiffy (USER_HZ is typically 10 milliseconds on i386). HZ and USER_HZ are configurable in the kernel build. Both delayed and immediate responses are permitted; however, delayed responses are more realistic. For delayed responses, a kernel timer is used. [Real adapters would generate an interrupt when the response was ready (i.e. the command had completed).] For a fast RAM disk set the delay parameter to 0. These SCSI commands ignore the delay parameter and respond immediately: INQUIRY, REPORT LUNS, REQUEST SENSE, SYNCHRONIZE CACHE plus various other non "media access" commands. TEST UNIT READY is considered a media access command.

The delay parameter may be set to -1 or -2 which uses a kernel tasklet to generate a more or less immediate response (but in a different kernel thread). The -1 variant schedules a high priority tasklet while -2 schedules a normal priority tasklet. Trying to write a new value to delay while there are queued command responses may result in an EBUSY error.


The dev_size_mb parameter allows the user to specify the size of the simulated storage. The unit is Mebibytes (each 2**20 bytes and a bit larger than a Megabyte) and the default value is 8. The maximum value depends on the capabilities of the vmalloc() call on the target architecture. If the module fails to load with a "cannot allocate memory" message then a "vmalloc=nn{KMG}" boot time argument may be needed. [See the kernel source file: Documentation/kernel-parameters.txt for more information on this.] The RAM reserved for storage is initialized to zeros which leads the sd (scsi disk) driver and the block layer to believe there is no partition table present. Partitions can be simulated with num_parts (see below). All simulated dummy devices share the same RAM. If a value of 0 or less is given then dev_size_mb is forced to 1 so 1 MiB of RAM is used. Given 512 byte logical blocks, the largest ramdisk that can be allocated is 2 TB but it is unlikely a system would be able to allocate that much RAM (a situation that would be bypassed if fake_rw=1). Very large amounts of "virtual" storage can be simulated with the virtual_gb parameter (see below).

The dif parameter sets the T10 protection type which is a value between 0 and 3 where 0 (the default) is no protection. Protection information is extra bytes of data (typically 8) associated with blocks of data transferred between a SCSI initiator and a SCSI block logical unit (as defined in T10 SBC standards). T10 protection information is often called the "data integrity field" hence the name DIF. For information about DIF and DIX see https://oss.oracle.com/projects/data-integrity/documentation/ .

The dix parameter when set causes protection information to be carried between the operating system and the SCSI initiator. DIX is an abbreviation of "data integrity eXtension" and can be viewed as a front end to DIF. When its value is zero (the default) then no protection information is carried within the operating system. When the dix parameter is a non zero value then the dix type will be the same as the dif parameter. So if dif=2 and dix=1 then both DIF and DIX are set to type 2 protection. Note that if dif=0 it doesn't matter what the dix parameter is, both DIF and DIX are set to type 0 protection (which is no protection).
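The interaction between dif and dix described above can be summarised in a short Python sketch. The function effective_protection is a hypothetical helper written for this page, not something the driver exposes.

```python
def effective_protection(dif=0, dix=0):
    """Return (dif_type, dix_type) as described in the text.

    dif is the T10 protection type (0 to 3). A non-zero dix makes the
    DIX type follow the DIF type; a zero dix means no DIX protection.
    Hypothetical helper for illustration only.
    """
    if not 0 <= dif <= 3:
        raise ValueError("dif must be between 0 and 3")
    dif_type = dif
    dix_type = dif if dix != 0 else 0
    return dif_type, dix_type


print(effective_protection(dif=2, dix=1))  # (2, 2)
print(effective_protection(dif=0, dix=1))  # (0, 0)
```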

The every_nth parameter takes a decimal number as an argument. When this number is greater than zero, then incoming commands are counted and when <n> is reached then the associated command generates some sort of error. Currently the available errors are timeout (when "opts & 4" is true) and RECOVERED_ERROR (when "opts & 8" is true) . Once the command count reaches <n> then it is reset to zero. For example setting every_nth to 3 and opts to 4 will cause every third command to be ignored (and hence a timeout). If every_nth is not given it is defaulted to 0 and timeouts and recovered errors will not be generated. Note that for the "every nth" mechanism to work the statistics parameter needs to be set.

If every_nth is negative then an internal command counter counts down to that value and when it is reached, continually generates the error condition (specified in opts) on each newly received command. The driver flags this continual error state by setting every_nth to -1 . The user can stop error conditions being generated on receipt of every subsequent command by writing 0 to every_nth (or opts ).
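The counting behaviour of every_nth can be modelled in Python as follows. This is a simplified sketch, not the driver's code: the name injected_commands is invented, and it assumes that for a negative every_nth the command that completes the count is itself the first one affected.

```python
def injected_commands(every_nth, num_commands):
    """Return 1-based positions of commands that get an injected error.

    Positive every_nth: every n-th command, with the counter reset each time.
    Negative every_nth: once abs(n) commands have been counted, every
    command from then on (the continual state the driver flags as -1).
    Zero: no error injection.
    """
    hits = []
    count = 0
    continual = False
    for position in range(1, num_commands + 1):
        if every_nth == 0:
            break
        count += 1
        if continual:
            hits.append(position)
        elif count >= abs(every_nth):
            hits.append(position)
            if every_nth > 0:
                count = 0          # start counting towards the next hit
            else:
                continual = True   # every later command is affected
    return hits


print(injected_commands(3, 10))   # [3, 6, 9]
print(injected_commands(-3, 6))   # [3, 4, 5, 6]
```

Which error is produced for each listed command depends on the opts flags (for example 4 for a timeout or 8 for RECOVERED_ERROR).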

The fake_rw parameter instructs the scsi_debug driver to ignore all READ and WRITE commands and return a GOOD status. This means the data "read" when fake_rw is set is whatever was previously in the scatter gather list. The default value is 0 (i.e. process READ and WRITE commands). This parameter is for testing and when set can confuse the kernel or utilities that look for partitions and other information on a "disk".

The guard parameter, when set to zero (the default), uses the T10 defined CRC in the protection information. When set to one the IP (internet protocol) checksum (as used by iSCSI ?) is used.

The host_lock parameter indicates whether each command (excluding its response delay and associated callback into the mid-layer) is surrounded by a per host host_lock (which is a kernel "spin lock"). In a SCSI multi-queue system the presence of this host lock will have the effect of serializing all commands from a host; and that is detrimental to system performance. Prior to version 1.84 this parameter was not available and the host_lock surrounded all commands. In version 1.84 and later the default is 0 which means the host_lock is not applied. Set host_lock=1 for the old behaviour. In version 1.88 this functionality (i.e. the host_lock) was removed and setting this parameter has no effect. It is kept so that scripts that call it will not break.

The inq_product parameter is the 16 byte ASCII string (left justified, space characters to the right) that gets reported by this driver's standard INQUIRY response. The default is "scsi_debug ".

The inq_rev parameter is the 4 byte ASCII string (left justified, space characters to the right) that gets reported by this driver's standard INQUIRY response. The driver version number (was "1.86") has been reformatted to be suitable for this field. The default value is now "0187" and will increase as changes are added to this driver.

The inq_vendor parameter is the 8 byte ASCII string (left justified, space characters to the right) that gets reported by this driver's standard INQUIRY response. The default is "Linux ".

The lbpu parameter, if set, causes the logical block provisioning VPD page to set the field of the same name. The default is to set the LBPU field to 0. When set this field indicates the UNMAP command is supported.

The lbpws and lbpws10 parameters cause the corresponding bits in the logical block provisioning VPD page to be set. They imply that the UNMAP field within the WRITE SAME(16) and WRITE SAME(10) commands, respectively, is supported.

The lbprz parameter, if set, causes the logical block provisioning VPD page to set the field of the same name. When this field is set, reading unmapped logical blocks returns block(s) of data full of zeros.

The lowest_aligned parameter sets the field called LOWEST ALIGNED LOGICAL BLOCK ADDRESS in the READ CAPACITY (16) command response. The default is zero which implies the logical block size and the physical block size are the same.

The max_luns parameter allows an upper limit to be placed on the logical unit number (lun) that the scsi_debug driver will respond to. A value of 2 means that this driver will respond to logical unit numbers 0 and 1. If max_luns is modified by a sysfs write then the scsi_debug driver modifies the scsi_host::max_lun member of all hosts that it owns. When max_luns is modified by a sysfs write then it will take effect the next time a host is added (see add_host) or when a scan is done on any existing host. The mid level scanning code will scan for up to but not including max_scsi_luns which is a SCSI mid level boot and module load time parameter.

The max_queue parameter indicates the maximum number of queued responses the driver can handle. This defaults to an internal define in the scsi_debug driver called SCSI_DEBUG_CANQUEUE which is currently 192 on 64 bit machines (96 on 32 bit machines). If both the delay and ndelay parameters are 0, no commands have queued responses. If there is an attempt to exceed this value then either SCSI_MLQUEUE_HOST_BUSY is returned to the mid-layer (the default) or a status of TASK SET FULL (if the 0x400 opts mask is set). Sysfs can be used at any time to change the value of max_queue, even when there are queued command responses.

The medium_error_count parameter indicates the number of blocks, including the medium_error_start LBA, on which to yield a SCSI MEDIUM ERROR sense key. This only occurs when the opts parameter has its bit 1 (i.e. 0x2) set. Its default value is 10.

The medium_error_start parameter indicates the first LBA to yield a SCSI MEDIUM ERROR sense key. This only occurs when the opts parameter has its bit 1 (i.e. 0x2) set. Its default value is 0x1234 (4660 in decimal).
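The range of affected blocks can be made concrete with a small Python sketch. The helper name read_hits_medium_error is hypothetical, and the sketch assumes that any read overlapping the range triggers the error, which is a simplification of whatever the driver actually checks.

```python
def read_hits_medium_error(lba, num_blocks, opts,
                           medium_error_start=0x1234,
                           medium_error_count=10):
    """True if a read of num_blocks starting at lba would touch the bad range.

    The bad range is medium_error_start .. medium_error_start +
    medium_error_count - 1, and it is only active when bit 1 (0x2) of
    opts is set. Overlap semantics are an assumption of this sketch.
    """
    if not (opts & 0x2) or num_blocks <= 0:
        return False
    first_bad = medium_error_start
    last_bad = medium_error_start + medium_error_count - 1
    last_read = lba + num_blocks - 1
    return lba <= last_bad and last_read >= first_bad


print(read_hits_medium_error(0x1234, 1, opts=0x2))       # True
print(read_hits_medium_error(0x1234 + 10, 1, opts=0x2))  # False
print(read_hits_medium_error(0x1234, 1, opts=0))         # False
```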

The ndelay parameter is the response delay whose units are nanoseconds. This mechanism depends on high resolution timers in the kernel which may not be supported on small or old systems (it is a kernel build config option). Its default value is 0 which means the delay parameter is operative. If ndelay is a positive value then a response delay for that many nanoseconds is active (and to indicate the delay parameter is overridden, it is set to -9999). Depending on the hardware, setting ndelay to less than a few microseconds probably causes no further reduction in the observed response delays. Trying to write a new value to ndelay while there are queued command responses may result in an EBUSY error.

The no_lun_0 parameter when set to a non zero value causes a lun 0 INQUIRY response of peripheral_qualifier==3 indicating there is no actual lu there. As required by SPC, lun 0 will still respond to a REPORT LUNS command. If the REPORT LUNS has a 'select report' code of 1 or 2, then one of the luns reported will be the REPORT LUNS well known logical unit (lun 49409 or 0xc101). The default value is 0. If max_luns is greater than 1, the first lun generated by scsi_debug will be lun 1 (since lun 0 is skipped). The REPORT LUNS well known logical unit (wlun) only supports the INQUIRY, REPORT LUNS, REQUEST SENSE and TEST UNIT READY SCSI commands. To make this wlun appear as a scsi generic (sg) device see the REPORT LUNS well known LUN example below.

The num_parts parameter writes a partition table to the ramdisk if the parameter's value is greater than 0. The default is 0 so in that case the ramdisk is simply all zeros. When num_parts is greater than zero a DOS format primary partition block is written to logical block 0, so the number of partitions is limited to a maximum of 4. The partitions are given an id of 0x83 which is a "Linux" partition. The available space on the ramdisk is roughly divided evenly between partitions when 2 or more partitions are requested. The partitions are not initialized with any file system. Even if no partitions are specified, a utility like fdisk can be used to add them later.

The num_tgts parameter allows the number of targets per host to be specified. It should be 0 or greater. Target id numbers start at 0 and ascend, bypassing the target id of the initiator (i.e. the HBA). If num_tgts is modified by a sysfs write then the scsi_debug driver modifies the scsi_host::max_id member of all hosts that it owns. When num_tgts is modified by a sysfs write then it will take effect the next time a host is added (see add_host) or when a scan is done on any existing host.

The opt_blks parameter is placed in the "Optimal transfer length" field of the Block Limits VPD page. Its default value is 64.

The opt_xferlen_exp parameter (with help from the physblk_exp parameter) controls the "Optimal transfer length granularity" field (OTLG) in the Block Limits VPD page. If 0 (default) or less than, or equal to, physblk_exp then the OTLG field is set to 2**physblk_exp making physblk_exp the effective default value. Otherwise, if this parameter is greater than physblk_exp then the OTLG field is set to 2**opt_xferlen_exp .
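The rule above reduces to taking the larger of the two exponents. A minimal Python sketch, assuming both exponents are non-negative (the function name is invented for illustration):

```python
def optimal_transfer_length_granularity(opt_xferlen_exp=0, physblk_exp=0):
    """OTLG field value, in logical blocks, per the rule in the text.

    If opt_xferlen_exp is 0 or not greater than physblk_exp, physblk_exp
    wins; otherwise opt_xferlen_exp is used. Assumes non-negative inputs.
    """
    if opt_xferlen_exp < 0 or physblk_exp < 0:
        raise ValueError("exponents are assumed to be non-negative")
    return 1 << max(opt_xferlen_exp, physblk_exp)


print(optimal_transfer_length_granularity(0, 3))  # 8
print(optimal_transfer_length_granularity(5, 3))  # 32
```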

The opts parameter takes a number as an argument which is the bitwise "or" of several flags. The flags that mention "nth" are only active when every_nth != 0 . So-called "read-write" commands include some others such as VERIFY. The flags supported are:

1 - "noisy" flag: all calls to entry points of driver are logged. Commands to be executed are shown in hex. Additional information such as check conditions, command aborts and resets are logged
2 - "medium error" flag: simulates a SCSI MEDIUM ERROR when LBA medium_error_start (default: 0x1234 (4660 in decimal)) is read. The following medium_error_count blocks (default: 10 blocks) less 1 also yield a medium error.
4 - ignore "nth" command causing a timeout.
8 - cause "nth" read or write command to yield a RECOVERED_ERROR.
0x10 - cause "nth" read-write command to yield an ABORTED_COMMAND (ack/nak timeout) which is a SAS transport error.
0x20 - cause "nth" read-write command to yield an ABORTED_COMMAND (logical block guard check failed), nominally a DIF (Protection Information) error
0x40 - cause "nth" read-write command to yield an ABORTED_COMMAND (logical block guard check failed), nominally a DIX error
0x80 - ignore "nth" media access command causing a timeout
0x100 - cause "nth" read command to yield half the data it was requested to read
0x200 - log generation of TASK SET FULL and host busy plus changes to queue depth and type
0x400 - if max_queue is exceeded yield a TASK SET FULL (default: host busy)
0x800 - cause "nth" read-write command whose queue_depth is at its maximum value to yield a status of TASK SET FULL
0x1000 - set WCE field in the caching page to 0 (default WCE=1)
0x2000 - log only abort commands and the various levels of reset
0x4000 - used together with the noisy flag (1) to suppress the logging of cdbs; additional information (if any) is still logged.
0x8000 - cause "nth" read or write command to yield a "host busy" (mid-level sent SCSI_MLQUEUE_HOST_BUSY)
0x10000 - cause "nth" read-write command to be aborted (via a call to block layer)

The opts "noisy" (or debug) flag will cause all scsi_debug entry points to be logged in the system log (and often sent to the console depending on how kernel informational messages are processed). With this flag set commands are listed in hex and if they yield a result other than successful then that is shown. In a busy system this may prove to be too much log "noise" in which case this combination of flags may be useful: opts=0x6201 .
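Since opts is a bitwise "or" of flags, a combination such as 0x6201 is easier to read when the bits are named. The Python sketch below does that; the member names are chosen for this example only and are not guaranteed to match any identifiers in the driver source.

```python
from enum import IntFlag


class Opts(IntFlag):
    """Bit values from the opts flag list on this page (names are illustrative)."""
    NOISE = 0x1
    MEDIUM_ERR = 0x2
    TIMEOUT = 0x4
    RECOVERED_ERR = 0x8
    TRANSPORT_ERR = 0x10
    DIF_ERR = 0x20
    DIX_ERR = 0x40
    MAC_TIMEOUT = 0x80
    SHORT_TRANSFER = 0x100
    Q_NOISE = 0x200
    ALL_TSF = 0x400
    RARE_TSF = 0x800
    N_WCE = 0x1000
    RESET_NOISE = 0x2000
    NO_CDB_NOISE = 0x4000
    HOST_BUSY = 0x8000
    CMD_ABORT = 0x10000


def describe(value):
    """List the names of the flags set in an opts value."""
    return [flag.name for flag in Opts if value & flag]


# The quieter logging combination suggested in the text.
QUIETER_LOGGING = (Opts.NOISE | Opts.Q_NOISE |
                   Opts.RESET_NOISE | Opts.NO_CDB_NOISE)
print(hex(int(QUIETER_LOGGING)))  # 0x6201
print(describe(0x6201))
```

The printed hexadecimal value can be written to the opts sysfs file as is, since input prefixed with "0x" is interpreted as hexadecimal.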

The opts "medium error" flag will cause any read command that accesses the medium_error_count blocks starting at LBA medium_error_start (default: 0x1234 (4660 in decimal)) to return a medium error indication to the mid level. The "ignore nth" flag is only active when every_nth != 0 . When an internal command counter reaches the value in every_nth and the "ignore nth" flag is set, then this command is ignored (i.e. quietly not processed). Typically this will cause the SCSI mid level code to time out the command which leads to further error processing. The internal command counter is reset to zero whenever opts is written to, whenever every_nth is written to, when the every_nth value is reached and at driver load time. The "recovered error" flag works in a similar fashion to the "ignore nth" flag, however when the every_nth value is reached and it is either a read or a write command then the command is processed normally but yields a "recovered error" indication. Such an indication is _not_ a hard error but for a real disk could indicate deteriorating media. The "aborted command" flag injects a transport error in a similar fashion to the way the "recovered error" flag works. A minor point: the kernel boot time and module load time opts parameter is a decimal integer. However the output sysfs value is a hexadecimal number (output as 0x9 for example) while the input value is interpreted as hexadecimal if prefixed by "0x" and decimal otherwise. When combining these flags it is easier to consider them as hexadecimal numbers.

The physblk_exp parameter becomes the "Logical blocks per physical block exponent" field in the READ CAPACITY (16) response. The default value is 0 which means the logical block and physical block sizes are the same.
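As a worked example of the exponent, the physical block size implied by a given sector_size and physblk_exp can be computed as below. This is a sketch only; physical_block_size is a name invented here, and the accepted sector sizes are taken from the parameter table.

```python
def physical_block_size(sector_size=512, physblk_exp=0):
    """Physical block size in bytes: sector_size * 2**physblk_exp.

    sector_size is the logical block size; the accepted values follow the
    parameter table on this page. Hypothetical helper for illustration.
    """
    if sector_size not in (512, 1024, 2048, 4096):
        raise ValueError("sector_size must be 512, 1024, 2048 or 4096")
    if physblk_exp < 0:
        raise ValueError("physblk_exp is assumed to be non-negative")
    return sector_size << physblk_exp


# 512 byte logical blocks on 4096 byte physical blocks ("512e" style).
print(physical_block_size(512, 3))  # 4096
```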

The ptype parameter allows the SCSI peripheral type to be set or modified. The default value is 0 which corresponds to a disk. Other useful peripheral types are 1 for tape, 3 for processor, 5 for dvd/cd and 13 for enclosure (SES).

The scsi_level parameter is the ANSI SCSI standard level with which the simulated disk announces it is compliant. The INQUIRY response which is generated by scsi_debug contains the ANSI SCSI standard level value (in byte 2).

The sector_size parameter (default 512) is the logical block size in bytes (assuming ptype=0 which means a block storage device).

The statistics parameter controls whether several internal counters are incremented or not. For speed the default is 0 (i.e. don't collect statistics). The "every nth" mechanism requires those internal counters so specifying a non-zero every_nth parameter will cause the statistics collection to be turned on.

The strict parameter can be 1 or 0 (the default). If 1 then it uses the cdb mask given in the REPORT SUPPORTED OPERATION CODES command to check each command cdb received by this driver. If any bit is set in the cdb but the corresponding bit is not set in the mask, then the command is rejected with a status of CHECK CONDITION, a sense key of ILLEGAL REQUEST and additional sense of INVALID FIELD in CDB. The sense data also points to the byte and bit position in the cdb that first failed the mask comparison. Byte long (and longer) fields will always point at bit 7 as failed. Each cdb is scanned in ascending byte order.
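
The mask comparison described above can be sketched as follows. This is not the driver's code: the function name check_cdb and the mask bytes in the example are hypothetical, and the sketch assumes that the bit reported is the most significant offending bit in the first offending byte.

```shell
#!/bin/sh
# check_cdb "CDB BYTES" "MASK BYTES": both arguments are space separated
# hex bytes. Prints "ok", or the byte and bit position of the first failure.
check_cdb() (
    cdb=$1
    mask=$2
    k=0
    for c in $cdb; do
        m=${mask%% *}                     # current mask byte
        case $mask in
            *" "*) mask=${mask#* } ;;     # drop it from the list
            *)     mask=00 ;;             # mask exhausted: nothing allowed
        esac
        rem=$(( 0x$c & ~0x$m & 0xff ))    # bits set in cdb but not in mask
        if [ "$rem" -ne 0 ]; then
            bit=7
            while [ $(( rem & 0x80 )) -eq 0 ]; do
                rem=$(( rem << 1 ))
                bit=$(( bit - 1 ))
            done
            echo "ILLEGAL REQUEST: byte=$k bit=$bit"
            exit 0
        fi
        k=$(( k + 1 ))
    done
    echo "ok"
)

# Hypothetical mask for a 10 byte cdb (not copied from the driver).
MASK="ff fa ff ff ff ff 3f ff ff c7"
check_cdb "28 00 00 00 12 34 00 00 08 00" "$MASK"
check_cdb "28 05 00 00 12 34 00 00 08 00" "$MASK"
```

The second call fails on byte 1 because bits 0 and 2 are set in the cdb but clear in the mask; scanning from bit 7 downward reports bit 2.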

The submit_queues parameter sets the number of submission queues the SCSI multi-queue logic will maintain for this driver. The default value is 1 which implies no multi-queue. If a value is given that exceeds the number of processors on the machine then the value used will be the number of processors on the machine. A warning is issued to the log if the driver reduces this value.

The uuid_ctl parameter controls whether a locally assigned NAA (64 bit value) is used to identify each logical unit (LU) simulated by this driver, or if a UUID (128 bit, RFC 4122) is used. If the value is 0 (the default) a locally assigned NAA is used. If the value is 1 then a new UUID (effectively a random value) is generated for each LU. If the value is 2 then the same generated UUID is used for all LUs simulated by this driver.

The virtual_gb parameter allows the scsi_debug driver to simulate a much larger storage device than the physical RAM available in the machine. When the virtual_gb parameter is 0 (its default value) then the maximum storage available is that indicated by the dev_size_mb parameter. When the virtual_gb parameter is greater than zero, that many Gibibytes (each of 2**30 bytes and larger than a Gigabyte) are reported by the READ CAPACITY command. Reading and writing of the "Gigabytes" of data wraps around within the available physical ram (which the scsi_debug driver has allocated and is dev_size_mb Mebibytes in size). When the number of virtual Gibibytes is 2048 or greater then READ CAPACITY (16) is needed to represent the size and READ (16) and/or WRITE (16) are needed to access data at the 2048 Gibibyte boundary and beyond. This boundary corresponds to 2**32 blocks (sectors), assuming blocks are 512 bytes long; the 32 bit LBA fields of the shorter commands top out at 2**32-1. The "wrapping" action still allows partitions to be written with fdisk and in many cases a file system to be initialized. Trying to store and retrieve any useful data on such a big virtual disk would not be wise! Setting the dev_size_mb parameter to a prime number, larger than the default value (which is 8) and that doesn't starve the machine for resources, seems to help in creating ext3 file systems. This helps because mkfs writes the file system super block at several offsets within the partition, and the wrap may cause the file system header to be overwritten. The virtual_gb option is designed for testing, not practical data storage.
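
The arithmetic behind the wrapping can be sketched as below, assuming that the wrap is a simple modulo on the number of blocks in the RAM store. The function names are made up for this example.

```shell
#!/bin/sh
# virtual_blocks VIRTUAL_GB SECTOR_SIZE: blocks reported by READ CAPACITY.
virtual_blocks() {
    echo $(( $1 * 1024 * 1024 * 1024 / $2 ))
}

# wrapped_lba LBA DEV_SIZE_MB SECTOR_SIZE: block within the RAM store that
# an access to LBA lands on (assumes a plain modulo wrap).
wrapped_lba() {
    store_blocks=$(( $2 * 1024 * 1024 / $3 ))
    echo $(( $1 % store_blocks ))
}

# 2048 GiB of 512 byte blocks is 2**32 blocks.
virtual_blocks 2048 512
# With the default dev_size_mb=8 the store holds 16384 blocks, so
# LBA 16384 lands on the same RAM as LBA 0, and LBA 16385 on LBA 1.
wrapped_lba 16384 8 512
wrapped_lba 16385 8 512
```

This also shows why data written through a large virtual disk cannot be trusted: many different LBAs share each block of RAM.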

The vpd_use_hostno parameter affects the way the scsi_debug driver generates its serial numbers, SAS and naa-5 addresses. When vpd_use_hostno is set to 1 (its default value) then the host number ("hostno"), target_id and lun are used to generate the serial number, SAS and naa-5 addresses. The formula is "((hostno + 1) * 2000) + (target_id * 1000) + lun". When vpd_use_hostno is set to 0 then the "hostno" term in the formula is set to 0. This has the effect of making multiple simulated hosts look like they are connected to the same drives (i.e. there are only "num_tgts * max_luns" unique simulated devices). The kernel will still report "add_host * num_tgts * max_luns" devices but higher level multipath aware software may see the difference.

Supported SCSI commands

Below is a list of supported commands. Some do nothing (e.g. SYNCHRONIZE CACHE). Those that have interesting functionality have notes in brackets. If the feature was introduced in a recent version (i.e. since 1.76) then that is noted.

CLOSE ZONE [added in 1.89]
COMPARE AND WRITE [1.85: added]
FINISH ZONE [added in 1.89]
GET LBA STATUS
INQUIRY [vital product data pages: 0, 0x80, 0x83] [1.77: VPD pages: 0x85, 0x86, 0x87, 0x88, 0x89, 0xb0] [1.87: VPD pages: 0x84, 0xb1, 0xb2] [1.89: VPD page: 0xb6]
LOG SENSE [1.78: temperature(0xd) and informational exceptions(0x2f)] [1.80: support log subpages]
MODE SELECT (6), MODE SELECT (10) [1.84: changeable pages: 0x8 (caching), 0xa (control) and 0x1c (informational exceptions)]
MODE SENSE (6), MODE SENSE (10) [sense pages: 1 (rw error recovery), 2 (disconnect), 3 (format), 8 (caching), 0xa (control), 0x1c (informational exceptions), 0x3f (read all)] [1.77: subpage support plus SAS pages: 0x19,0 0x19,1 and 0x19,2]
OPEN ZONE [added in 1.89]
PRE-FETCH(10), PRE-FETCH(16) [added in 1.89]
PREVENT ALLOW MEDIUM REMOVAL
READ (6), READ (10), READ(12), READ(16), READ(32)
READ CAPACITY (10), READ CAPACITY (16) [1.79: added 16 byte command]
RELEASE (6), RELEASE (10)
REPORT LUNS [1.77: shows REPORT LUNS wlun]
REPORT REALMS [added in 1.89, not implemented]
REPORT SUPPORTED OPERATION CODES [1.85: added]
REPORT SUPPORTED TASK MANAGEMENT FUNCTIONS [1.85: added]
REPORT TARGET PORT GROUPS
REPORT ZONES [added in 1.89]
REQUEST SENSE [1.79: shows MRIE=6 failure prediction, power states]
RESERVE (6), RESERVE (10)
RESET WRITE POINTER [added in 1.89]
REZERO UNIT (which is REWIND for tapes)
SEND DIAGNOSTIC [1.78: maintains start and stop states, when stopped fails media access commands]
SYNCHRONIZE CACHE (10, 16)
TEST UNIT READY [1.78: in stopped state gives appropriate error]
UNMAP
VERIFY (10), VERIFY(16) supporting BYTCHK=1 or 3 [added in 1.89]
WRITE (6), WRITE (10), WRITE (12), WRITE (16), WRITE(32)
WRITE BUFFER
WRITE SAME(10), WRITE SAME(16)
WRITE SCATTERED(16, 32)
<< XDWRITEREAD (10) [which is a bidirectional command] [removed around lk 5.0] >>

The implementations of the above commands are sufficient for the scsi subsystem to detect and attach devices. The fdisk, e2fsck and mount commands also work as do the utilities found in the sg3_utils package (see the main page). Crude error processing picks up unsupported commands and attempts to read or write outside the available RAM storage area.

Modern SCSI devices use vital product page 0x83 for identification. This driver yields both "T10 vendor identification" and "NAA" descriptors. The former yields an ASCII string like "Linux scsi_debug 4000" where the "4000" is ((host_no + 1) * 2000) + (target_id * 1000) + lun. In this case "4000" corresponds to host_no==1, target_id==0 and lun==0. The "NAA-5" descriptor is an 8 byte binary value that looks like this hex sequence: "51 23 45 60 00 00 0f a0" where the IEEE company id is 0x123456 (fake) and the vendor specific identifier in the least significant bytes is 4000 (which is fa0 in hex). [The "4000" is derived the same way for both descriptors.]
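
The identifier arithmetic can be checked with a small sketch. The function names lu_id and naa5 are invented here, and the NAA-5 layout used (a 4 bit NAA value of 5, a 24 bit company id, then a 36 bit vendor specific identifier) is an assumption that matches the hex sequence quoted above. The optional fourth argument mimics vpd_use_hostno=0.

```shell
#!/bin/sh
# lu_id HOST_NO TARGET_ID LUN [VPD_USE_HOSTNO]: the number embedded in the
# identification descriptors. VPD_USE_HOSTNO defaults to 1.
lu_id() {
    host=$1
    if [ "${4:-1}" -eq 0 ]; then
        host=0          # the hostno term is forced to 0
    fi
    echo $(( (host + 1) * 2000 + $2 * 1000 + $3 ))
}

# naa5 COMPANY_ID VENDOR_SPECIFIC: 64 bit NAA-5 value as 16 hex digits.
naa5() {
    printf '%016x\n' "$(( (0x5 << 60) | ($1 << 36) | $2 ))"
}

id=$(lu_id 1 0 0)
echo "Linux scsi_debug $id"
naa5 0x123456 "$id"
```

For host_no==1, target_id==0 and lun==0 this prints the "4000" string and 5123456000000fa0, the same bytes as the sequence shown above.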

Read and write commands executed by the scsi_debug driver are atomic (i.e. a write to one scsi_debug device will not interrupt (split) a read from another scsi_debug device). So a read command will either yield the contents of ram before a coincident write, or after the coincident write has finished.

The START STOP UNIT (SSU) and SYNCHRONIZE CACHE (SC) commands have special, longer delay processing from version 0188 onward. For both commands, if ndelay <= 10,000 (10 microseconds) then the long delays are ignored. Otherwise SSU has a delay of at least 1 second, and if delay > 1 then its delay is that many seconds. The longer delay for SC is 1/20 that of SSU (e.g. if delay=2 then SSU's delay is 2 seconds and SC's delay is 100 milliseconds).

Logical and physical block size

scsi_debug supports emulating devices with logical block sizes bigger than 512 bytes. This can be specified using the sector_size option.

Some storage devices use physical block sizes bigger than 512 bytes internally but expose a 512-byte logical block size to the host for compatibility reasons. The physblk_exp parameter can be used to indicate that the internal block size is 2^n times bigger than the reported logical block size. For instance: Supplying physblk_exp=3 on the command line will cause scsi_debug to simulate a device with 512-byte logical blocks and 4KB physical blocks.
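
The relationship between the two block sizes is a single shift, as this small sketch shows. The function name phys_block_size is made up for the example.

```shell
#!/bin/sh
# phys_block_size SECTOR_SIZE PHYSBLK_EXP: physical block size in bytes,
# i.e. the logical block size multiplied by 2**PHYSBLK_EXP.
phys_block_size() {
    echo $(( $1 << $2 ))
}

phys_block_size 512 0     # default: logical and physical sizes are equal
phys_block_size 512 3     # 512 byte logical blocks, 4 KB physical blocks
```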

Not all storage devices have logical block 0 aligned to a physical block boundary. These devices can be emulated using scsi_debug's lowest_aligned option. The parameter indicates the lowest LBA that is aligned to a physical block boundary.

Logical block provisioning

SBC-3 introduced Logical block provisioning. That term covers both "thin provisioning" (the earlier term for this facility) and "over provisioning" as used in modern SSDs.

Thin provisioning means that devices can report a capacity that is bigger than the space actually allocated. When files are deleted, the relevant blocks can be reclaimed by the storage device and used for something else. And consequently only blocks that are actively in use consume physical storage space.

SBC-3 specifies two different approaches for marking blocks as unused: WRITE SAME(16) with the UNMAP bit set, and the UNMAP command. scsi_debug supports both methods and they are controlled via 4 module parameters:

unmap_max_desc specifies the maximum number of ranges that can be unmapped using a single UNMAP command. If this is set to 0, only WRITE SAME is supported and UNMAP will cause a check condition.
unmap_granularity specifies the granularity at which to track mapped blocks (specified in number of logical blocks). 2048 (1 MB) is a realistic value for disk arrays although some may have a finer granularity.
unmap_alignment specifies the first LBA which is naturally aligned on an unmap_granularity boundary.
unmap_max_blocks specifies the maximum number of blocks that can be unmapped using a single UNMAP command. Default is 0xffffffff.

Examples:

 modprobe scsi_debug lbpws=1 unmap_max_desc=0 unmap_granularity=1

will simulate a device that only supports WRITE SAME(16) and which tracks usage on a per logical block basis. This is how most solid state drives work.

 modprobe scsi_debug lbpu=1 unmap_max_desc=64 unmap_granularity=2048

will simulate a device that supports UNMAP and which is provisioned in 1MB chunks. This is a common scenario for thinly provisioned storage arrays.
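
To relate unmap_granularity to sizes in bytes, a sketch like the following may help. The function names are invented, and the count of tracked chunks assumes unmap_alignment is 0 and that the map covers the RAM store rounded up to whole chunks.

```shell
#!/bin/sh
# granule_bytes UNMAP_GRANULARITY SECTOR_SIZE: size of one tracked chunk.
granule_bytes() {
    echo $(( $1 * $2 ))
}

# granule_count DEV_SIZE_MB UNMAP_GRANULARITY SECTOR_SIZE: number of chunks
# needed to cover the RAM store (rounded up; assumes unmap_alignment=0).
granule_count() {
    blocks=$(( $1 * 1024 * 1024 / $3 ))
    echo $(( (blocks + $2 - 1) / $2 ))
}

granule_bytes 2048 512      # 1 MB chunks, as in the second example
granule_count 8 2048 512    # chunks in the default 8 MB store
granule_count 8 1 512       # per logical block tracking
```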

The current block allocation bitmap can be viewed from user space via:

 cat /sys/bus/pseudo/drivers/scsi_debug/map

Unit attentions

An important feature of the SCSI command sets is the concept of a Unit Attention (UA). This is a mechanism for the "device server" within a logical unit (e.g. a disk) to report to the originator (e.g. a user space program, a file system or the kernel) that something, not directly related to the command that was just sent, has happened. That report takes the form of the command not being done and sense data with the UNIT ATTENTION sense key being returned. Additional information about the UA is provided in the sense data and the originator is expected to take note. UAs are typically only reported once so if the initiator repeats the command it should work (or a different type of UA might be delivered).

An example might make this clearer. It is possible to change the number of logical blocks on a disk; the FORMAT command could do that. In the scsi_debug driver even though dev_size_mb cannot be changed at run time, the virtual_gb parameter can be. If the virtual_gb parameter is changed (via sysfs, after the driver has been running), then the "Capacity data has changed" UA condition is set. The next command sent to that device will receive that UA (with some exceptions) in the returned sense data (and the command is not done). The exceptions are the INQUIRY, REPORT LUNS and REQUEST SENSE commands which skip UA reporting (see SAM-5 for details). Once the originator sees that UNIT ATTENTION sense key, it should note the reason, and repeat the command unless it is directly impacted. If the command that got "hit" by this UA was a READ or a WRITE then the originator might want to do a READ CAPACITY command first, at least to check that the LBA given to the READ or WRITE command was still in range.

The scsi_debug driver reports these Unit Attentions:

Power on, reset, or bus device reset occurred
SCSI bus reset occurred
Mode parameters changed
Capacity data has changed

If there is more than one UA, then they are reported in the ascending order of that list.
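
One way to picture that ordering is a bitmask of pending UAs from which the lowest numbered bit is always reported first. The sketch below assumes bit positions 0 to 3 in the order of the list above; both the positions and the function names are hypothetical.

```shell
#!/bin/sh
# ua_name INDEX: text for each UA, in the order of the list above.
ua_name() {
    case $1 in
        0) echo "Power on, reset, or bus device reset occurred" ;;
        1) echo "SCSI bus reset occurred" ;;
        2) echo "Mode parameters changed" ;;
        3) echo "Capacity data has changed" ;;
        *) echo "none" ;;
    esac
}

# next_ua PENDING_MASK: index of the lowest set bit, or "none".
next_ua() {
    i=0
    while [ "$i" -lt 4 ]; do
        if [ $(( ($1 >> i) & 1 )) -eq 1 ]; then
            echo "$i"
            return 0
        fi
        i=$(( i + 1 ))
    done
    echo "none"
}

# Mode parameters changed (bit 2) and capacity changed (bit 3) both pending.
pending=$(( (1 << 2) | (1 << 3) ))
first=$(next_ua "$pending")
ua_name "$first"
# Once reported, that UA is cleared and the next one becomes visible.
pending=$(( pending & ~(1 << first) ))
ua_name "$(next_ua "$pending")"
```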

Zoned Block Devices

In version 1.89 of this driver support was added for "host-managed" Zoned Block devices that comply with the "sequential write required" model. All scsi_debug devices created when this module is loaded with the "zbc=host-managed" parameter will be of this type, which has a ptype value of 0x14 (ptype being the SCSI Peripheral Device Type, or pdt).

Examples

Basic

Since scsi_debug is for testing it seems more useful to build it as a module rather than build it into the kernel. Some parameters cannot be changed once the scsi_debug driver is running. So if it is a module then it can be removed with rmmod and reloaded with another modprobe call with the desired parameters.

When the driver is loaded successfully simulated disks should be visible just like other SCSI devices:

# modprobe scsi_debug

# lsscsi -s
[0:0:0:0] disk SEAGATE ST33000650SS 0005 /dev/sda 3.00TB
[0:0:1:0] enclosu Intel RES2SV240 0d00 - -
[4:0:0:0] disk ATA ST3160812AS D /dev/sdb 160GB
[7:0:0:0] disk Linux scsi_debug 0184 /dev/sdc 8.38MB

In this case there is a 3 TB SAS disk, an ATA disk and a small scsi_debug pseudo disk. The other device (at [0:0:1:0]) is a SCSI Enclosure Service (SES) device. The /dev/sdc pseudo disk is full of zeros and has no partitions. To get a partition the num_parts parameter could have been used on the modprobe line or it could be done from the command line with the fdisk /dev/sdc command. Assuming one ext3 partition is allocated to the whole pseudo disk (8 MB in this case) then the mkfs.ext3 /dev/sdc1 command can be used to make an ext3 file system. Now /dev/sdc1 can be mounted and treated like a normal file system. Naturally when the power is turned off anything stored in /dev/sdc1 will be forgotten.

Rather than mounting the pseudo disk, the sg3_utils package could be used to carry out various tests on it.

Information about the scsi_debug driver version, its current parameters and some other data can be found in the "proc" file system. The trailing number in the path is the scsi_debug host number, which is the first element in the 4 item tuple shown in the lsscsi output above:

# cat /proc/scsi/scsi_debug/3
scsi_debug adapter driver, version 0189 [20200225]
num_tgts=1, shared (ram) size=1024 MB, opts=0x0, every_nth=0
delay=-9999, ndelay=100000, max_luns=10, sector_size=512 bytes
cylinders=130, heads=255, sectors=63, command aborts=0
RESETs: device=0, target=0, bus=0, host=0
dix_reads=0, dix_writes=0, dif_errors=0
usec_in_jiffy=1000, statistics=0
cmnd_count=0, completions=0, miss_cpus=0, a_tsf=0
submit_queues=1
  queue 0:

Here is an important sysfs directory for the scsi_debug driver:

# cd /sys/bus/pseudo/drivers/scsi_debug/

# ls -x
adapter0        add_host       ato        bind         cdb_len      delay
dev_size_mb     dif            dix        doublestore  dsense       every_nth
fake_rw         guard          host_lock  map          max_luns     max_queue
ndelay          no_lun_0       no_uld     num_parts    num_tgts     opts
ptype           random         removable  scsi_level   sector_size  statistics
strict          submit_queues  uevent     unbind       uuid_ctl     virtual_gb
vpd_use_hostno  zbc

Those files correspond to most of the scsi_debug parameters. Those that are writable can be modified, and the behavior of scsi_debug will change accordingly thereafter. Certain parameters cannot be changed while the driver is busy (e.g. it has queued command responses), in which case EBUSY is returned if the user attempts to change one. Reading one can be done with the cat command and changing one can be done with the echo command:

# cat every_nth
0
# echo 2000 > every_nth


Another important sysfs directory for (any) disks is /sys/block/<disk_node_name> and its queue sub-directory. So in the case of this scsi_debug pseudo disk that directory would be /sys/block/sdc/queue. Also there is the scsi_device sysfs directory that has the form /sys/class/scsi_device/<h:c:t:l>/device where the <h:c:t:l> tuple is found at the left hand side of each device listed by lsscsi. This sysfs directory contains many important SCSI device parameters, some of which can be modified.

Adding and removing hosts and devices

Individual devices can be removed via sysfs and the mid-level by writing any value into the "delete" member in the sysfs directory corresponding to the scsi device. Given these devices:


# lsscsi -s
[0:0:0:0] disk SEAGATE ST200FM0073 0A04 /dev/sda 200GB
[4:0:0:0] disk ATA ST3160812AS D /dev/sdb 160GB
[7:0:0:0] disk Linux scsi_debug 0184 /dev/sdc 21.4GB

then the scsi_debug (pseudo) disk can be deleted like this:

# echo 1 > /sys/class/scsi_device/7:0:0:0/device/delete

After which this should be seen:

# lsscsi -s
[0:0:0:0] disk SEAGATE ST200FM0073 0A04 /dev/sda 200GB
[4:0:0:0] disk ATA ST3160812AS D /dev/sdb 160GB

This will work for any scsi device (not just those belonging to scsi_debug). That scsi device can be re-added with the following command:

# echo "0 0 0" > /sys/class/scsi_host/host7/scan

# lsscsi
[0:0:0:0] disk SEAGATE ST200FM0073 0A04 /dev/sda 
[4:0:0:0] disk ATA ST3160812AS D /dev/sdb 
[7:0:0:0] disk Linux scsi_debug 0184 /dev/sdc

The three numbers in the "echo" are channel number, target number and lun, respectively. Wildcards (hyphen: "-") can be given for any or all of the three numbers.

# echo 3 > /sys/bus/pseudo/drivers/scsi_debug/max_luns
# echo 2 > /sys/bus/pseudo/drivers/scsi_debug/num_tgts
# echo "0 - -" > /sys/class/scsi_host/host7/scan

# lsscsi
[0:0:0:0] disk SEAGATE ST200FM0073 0A04 /dev/sda 
[4:0:0:0] disk ATA ST3160812AS D /dev/sdb 
[7:0:0:0] disk Linux scsi_debug 0184 /dev/sdc 
[7:0:0:1] disk Linux scsi_debug 0184 /dev/sdd 
[7:0:0:2] disk Linux scsi_debug 0184 /dev/sde 
[7:0:1:0] disk Linux scsi_debug 0184 /dev/sdf 
[7:0:1:1] disk Linux scsi_debug 0184 /dev/sdg 
[7:0:1:2] disk Linux scsi_debug 0184 /dev/sdh

The 'echo "0 - -" > scan' line above added five devices: /dev/sdd to /dev/sdh .

Extra hosts can be added and removed from the scsi_debug driver as follows:

# cd /sys/bus/pseudo/drivers/scsi_debug
# echo 1 > add_host  # add a new host (after the existing hosts)
# echo -2 > add_host # remove the last two hosts (if at least that many are present)

The scsi_debug driver does not have any limits on the number of scsi devices it can create. By default when loaded it has one scsi device (owned by a host). Larger numbers of devices can be introduced at load time by specifying the add_host, num_tgts and/or max_luns parameters; the number of scsi devices created is the product of the 3 parameters (they all default to 1). Alternatively sysfs can be used to add (or remove) scsi devices after the scsi_debug driver is loaded. Two strategies can be used:

increase the value of num_tgts or max_luns then use a line like 'echo "0 - -" > scan' (shown above) to a host already owned by the scsi_debug driver.
add more hosts with a line like 'echo 3 > add_host'. Each new host will create (num_tgts * max_luns) new scsi devices. Of course num_tgts or max_luns can be modified prior to calling 'echo 3 > add_host'.
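
The device counts that result from either strategy are simple products, as in this sketch. The function name total_devices is made up for the example.

```shell
#!/bin/sh
# total_devices ADD_HOST NUM_TGTS MAX_LUNS: scsi devices the driver presents.
total_devices() {
    echo $(( $1 * $2 * $3 ))
}

before=$(total_devices 1 1 1)     # defaults: a single device
after=$(total_devices 1 2 3)      # after num_tgts=2 and max_luns=3
echo "devices added by the rescan: $(( after - before ))"

# Each extra host then brings num_tgts * max_luns more devices.
echo "devices after adding 3 hosts: $(total_devices 4 2 3)"
```

The first result, five added devices, agrees with the /dev/sdd to /dev/sdh example shown earlier in this section.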

Even though the scsi_debug driver can create ten thousand or more devices, it doesn't mean that the scsi mid-level, sd, sg, the block layer and various other kernel components will handle it gracefully.

Mode pages

The supported mode pages are listed following the MODE SENSE entry in the supported commands section above. Prior to version 1.80, when a mode page is read no block descriptor is included in the response. From version 1.78 the MODE SELECT command is supported. Three mode pages can be modified:

caching (WCE field is changeable) [added in version 1.84]
control (D_SENSE field is acted upon)
informational exceptions control (MRIE and TEST fields are acted upon by REQUEST SENSE)

The saved pages are not supported, reflecting that the scsi_debug driver has only volatile storage. All fields can be changed, but only those fields indicated above have side effects.

doublestore and VERIFY

Various users have asked for each scsi_debug device (i.e. "Logical Unit" (LU) in SCSI parlance) to have its own ram or backing store rather than all devices sharing the same ram. The answer has been: "look at tcm_loop" because this driver has been built to simulate thousands of hosts, targets and devices (LUs) without consuming the sort of resources that would usually imply.

That said, having only one ram image is a bit limiting. Having two ram images, shared between all the scsi_debug LUs (where the number of LUs is assumed to be greater than one), allows the correctness of copies to be checked. The SCSI VERIFY command simulation in version 1.89 of this driver has been "beefed up" from doing nothing and always returning a GOOD status, to doing a proper comparison of the data-out buffer against the ram disk when the BYTCHK field is set to 1. If that comparison fails, it stops and returns a CHECK CONDITION status, with a sense key of MISCOMPARE.

The following first loads the scsi_debug module with 4 LUs that share two slabs of 1 GiB, so /dev/sda and /dev/sdc share one, and /dev/sdb and /dev/sdd share the other. The logical block size defaults to 512 bytes and the ndelay=100000 random=1 means that the delay on each media command is uniformly distributed between a maximum of 100 microseconds and 0 microseconds. dd is not too smart trying to write random data off the end of /dev/sdc but it does the job. Then there is a copy of the data from /dev/sg2 (aka /dev/sdc) to /dev/sg3. Finally the --verify on the sg_dd utility (found in version 1.45 and later of the sg3_utils package) compares /dev/sg2 and /dev/sg3 using the SCSI VERIFY(BYTCHK=1) command.

# modprobe scsi_debug max_luns=4 dev_size_mb=1024 ndelay=100000 random=1 doublestore=1
# lsscsi -gs
[0:0:0:0]    disk    Linux    scsi_debug       0189  /dev/sda   /dev/sg0   1.07GB
[0:0:0:1]    disk    Linux    scsi_debug       0189  /dev/sdb   /dev/sg1   1.07GB
[0:0:0:2]    disk    Linux    scsi_debug       0189  /dev/sdc   /dev/sg2   1.07GB
[0:0:0:3]    disk    Linux    scsi_debug       0189  /dev/sdd   /dev/sg3   1.07GB
[N:0:1:1]    disk    INTEL SSDPEKKF256G7L__1                    /dev/nvme0n1  -           256GB

# dd if=/dev/urandom of=/dev/sdc
dd: writing to '/dev/sdc': No space left on device
2097153+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 49.4036 s, 21.7 MB/s

# sg_dd if=/dev/sg2 of=/dev/sg3 bs=512
2097152+0 records in
2097152+0 records out

# sg_dd --verify if=/dev/sg2 of=/dev/sg3 bs=512
2097152+0 records in
2097152+0 records verified
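
The record counts above follow directly from dev_size_mb and the block size. The sketch below also shows how the four LUs could map onto the two stores; the mapping by LUN parity is only inferred from the pairing described above (it is not taken from the driver source), and both function names are invented.

```shell
#!/bin/sh
# store_blocks DEV_SIZE_MB SECTOR_SIZE: blocks held by one RAM store.
store_blocks() {
    echo $(( $1 * 1024 * 1024 / $2 ))
}

# store_for_lun LUN: which of the two stores a LUN uses with doublestore=1.
# Assumption: even LUNs share one store and odd LUNs share the other.
store_for_lun() {
    echo $(( $1 % 2 ))
}

# 1024 MiB of 512 byte blocks is the 2097152 records seen by dd and sg_dd.
store_blocks 1024 512

for lun in 0 1 2 3; do
    echo "lun $lun -> store $(store_for_lun "$lun")"
done
```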

REPORT LUNS Well Known LU

There are two techniques for discovering the luns that a SCSI target supports. The first (and oldest) is based on sending commands like INQUIRY and REPORT LUNS to lun 0, even if the target has no lun 0. The second technique is based on one of the so-called "well known logical units", specifically the REPORT LUNS well known logical unit. If present it must support the INQUIRY, REPORT LUNS, REQUEST SENSE and TEST UNIT READY commands. Simulating one with scsi_debug is somewhat contorted:


# modprobe scsi_debug no_lun_0=1 max_luns=2
#
# lsscsi -g
[0:0:0:0] disk ATA INTEL SSDSC2BW18 DC32 /dev/sda /dev/sg0 
[3:0:0:1] disk Linux scsi_debug 0184 /dev/sdb /dev/sg1
#
# lsscsi --hosts
[0] ahci 
[1] ahci 
[2] ahci 
[3] scsi_debug
#

# ## Pick the host number corresponding to scsi_debug (i.e. "3")
#
# cd /sys/class/scsi_host/host3
# echo "- - 49409" > scan
#
# lsscsi -g
[0:0:0:0] disk ATA INTEL SSDSC2BW18 DC32 /dev/sda /dev/sg0 
[3:0:0:1] disk Linux scsi_debug 0184 /dev/sdb /dev/sg1 
[3:0:0:49409]wlun Linux scsi_debug 0184 - /dev/sg2

The scsi_debug driver needed to be told that it had no_lun_0 so it started generating luns at 1 ([3:0:0:1]) and then the scsi sub-system needed to be told to scan specifically for lun 49409 (0xc101). Thereafter the REPORT LUNS wlun appeared.
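
The scan file takes the lun as a decimal number while the well known lun is usually quoted in hex, so a quick conversion helps. This is a small sketch; the helper names are made up.

```shell
#!/bin/sh
# to_dec VALUE: print VALUE (decimal or 0x prefixed hex) in decimal.
to_dec() {
    printf '%d\n' "$1"
}

# to_hex VALUE: print VALUE in 0x prefixed hexadecimal.
to_hex() {
    printf '0x%x\n' "$1"
}

to_dec 0xc101     # the number to write to the scan file
to_hex 49409      # and back again
```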

The way a SCSI initiator (host) scans for targets is transport specific. In the case of the scsi_debug driver it has a magic transport (bus) called "pseudo" which does the right thing. Apart from target discovery, the scsi_debug driver tries to simulate SAS devices, see the next section.

SAS personality

The scsi_debug driver has a Serial Attached SCSI (SAS) personality. For any application that cares, it looks like a dual ported SAS disk accessed via the primary port (relative target port 1). In one case it masquerades as a SATA disk behind a SCSI to ATA Translation (SAT) layer (SATL). Many of the settings are in common with Fibre Channel dual ported disks.

The driver sets the MULTIP (multiport) bit in the INQUIRY response. The following VPD pages are SAS or SAT specific:

device identification page [0x83] (yields naa-5 addresses for the lu, the accessing target port and the target device, plus some other designators)
SCSI ports [0x88] (shows the naa-5 addresses of both ports)
ATA information [0x89] (simulates a SATA disk in a SAS domain, defined in SAT)

The naa-5 addresses are meant to be world wide unique names which represents a challenge to the scsi_debug driver. Amongst other things Linux does not have an IEEE company id [memo: OSDL]. Even if it did, making them truly unique in a virtual driver, especially if multiple boxes could somehow see each other, would be difficult.

There are also several SAS specific mode pages:

protocol specific port page (SAS): short format page [0x19,0x0]
protocol specific port page (SAS): phy control and discover subpage [0x19,0x1]
protocol specific port page (SAS): shared mode subpage [0x19,0x2] (sas2 version)

Both the VPD and mode pages can be viewed from user space with an application like sdparm. Below is an example of the device identification VPD page:


# sdparm -i /dev/sda
 /dev/sda: Linux scsi_debug 0004
Device identification VPD page:
 Addressed logical unit:
 desig_type: T10 vendor identification, code_set: ASCII
 vendor id: Linux
 vendor specific: scsi_debug 2000
 desig_type: NAA, code_set: Binary
 0x53333330000007d0
 Target port:
 desig_type: Relative target port, code_set: Binary
 transport: Serial Attached SCSI (SAS)
 Relative target port: 0x1
 desig_type: NAA, code_set: Binary
 transport: Serial Attached SCSI (SAS)
 0x52222220000007ce
 Target device that contains addressed lu:
 desig_type: NAA, code_set: Binary
 transport: Serial Attached SCSI (SAS)
 0x52222220000007cd
 desig_type: SCSI name string, code_set: UTF-8
 transport: Serial Attached SCSI (SAS)
 SCSI name string:
 naa.52222220000007CD

Below is an example of the SCSI ports VPD page showing a dual ported target:

# sdparm -i -p sp /dev/sda
 /dev/sda: Linux scsi_debug 0004
SCSI Ports VPD page:
Relative port=1
 Target port descriptor(s):
 desig_type: NAA, code_set: Binary
 transport: Serial Attached SCSI (SAS)
 0x52222220000007ce
Relative port=2
 Target port descriptor(s):
 desig_type: NAA, code_set: Binary
 transport: Serial Attached SCSI (SAS)
 0x52222220000007cf

Notice that the above implies that the INQUIRY was sent via port 1 (port A) of the emulated SAS dual ported target. The protocol specific port phy control and discover mode subpage [0x19,0x1] has target port/phy SAS addresses that correspond to the SCSI ports VPD page:

# sdparm -t sas -p pcd -l /dev/sda
 /dev/sda: Linux scsi_debug 0004
 Direct access device specific parameters: WP=0 DPOFUA=0
port: phy control and discover (SAS) mode page:
 PPID_1 6 [cha: n, def: 6] Port's (transport) protocol identifier
 NOP 2 [cha: n, def: 2] Number of phys
 PHID 0 [cha: n, def: 0] Phy identifier
 ADT 1 [cha: n, def: 1] Attached device type
 NPLR 9 [cha: n, def: 9] Negotiated physical link rate
 ASIP 1 [cha: n, def: 1] Attached SSP initiator port
 ATIP 0 [cha: n, def: 0] Attached STP initiator port
 AMIP 0 [cha: n, def: 0] Attached SMP initiator port
 ASTP 0 [cha: n, def: 0] Attached SSP target port
 ATTP 0 [cha: n, def: 0] Attached STP target port
 AMTP 0 [cha: n, def: 0] Attached SMP target port
 SASA 0x52222220000007ce [cha: n, def:0x52222220000007ce] SAS address
 ASASA 0x5111111000000001 [cha: n, def:0x5111111000000001] Attached SAS address
 APHID 2 [cha: n, def: 2] Attached phy identifier
 PMILR 8 [cha: n, def: 8] Programmed minimum link rate
 HMILR 8 [cha: n, def: 8] Hardware minimum link rate
 PMALR 9 [cha: n, def: 9] Programmed maximum link rate
 HMALR 9 [cha: n, def: 9] Hardware maximum link rate
 2_PHID 1 [cha: n, def: 1] Phy identifier
 2_ADT 1 [cha: n, def: 1] Attached device type
 2_NPLR 9 [cha: n, def: 9] Negotiated physical link rate
 2_ASIP 1 [cha: n, def: 1] Attached SSP initiator port
 2_ATIP 0 [cha: n, def: 0] Attached STP initiator port
 2_AMIP 0 [cha: n, def: 0] Attached SMP initiator port
 2_ASTP 0 [cha: n, def: 0] Attached SSP target port
 2_ATTP 0 [cha: n, def: 0] Attached STP target port
 2_AMTP 0 [cha: n, def: 0] Attached SMP target port
 2_SASA 0x52222220000007cf [cha: n, def:0x52222220000007cf] SAS address
 2_ASASA 0x5111111000000001 [cha: n, def:0x5111111000000001] Attached SAS address
 2_APHID 3 [cha: n, def: 3] Attached phy identifier
 2_PMILR 8 [cha: n, def: 8] Programmed minimum link rate
 2_HMILR 8 [cha: n, def: 8] Hardware minimum link rate
 2_PMALR 9 [cha: n, def: 9] Programmed maximum link rate
 2_HMALR 9 [cha: n, def: 9] Hardware maximum link rate

Other supported mode pages can be accessed in a similar way by the sdparm utility. Note that transport specific mode pages need the transport identified: hence the '-t sas' option above.
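
The naa-5 values in the sdparm listings above follow a regular pattern, which the sketch below reproduces for host 0, target 0, lun 0. The fake company ids (0x333333 for the logical unit, 0x222222 for the target device and its ports) and the offsets of -3, -2 and -1 from the target's base number are read off that output; they are assumptions about this driver version rather than documented behavior, and the function names are invented.

```shell
#!/bin/sh
# naa5 COMPANY_ID VENDOR_SPECIFIC: 64 bit NAA-5 value in hex.
naa5() {
    printf '0x%016x\n' "$(( (0x5 << 60) | ($1 << 36) | $2 ))"
}

# sas_ids HOST_NO TARGET_ID LUN: print the four addresses seen above.
sas_ids() {
    base=$(( ($1 + 1) * 2000 + $2 * 1000 ))
    echo "logical unit:  $(naa5 0x333333 $(( base + $3 )))"
    echo "target device: $(naa5 0x222222 $(( base - 3 )))"
    echo "target port A: $(naa5 0x222222 $(( base - 2 )))"
    echo "target port B: $(naa5 0x222222 $(( base - 1 )))"
}

sas_ids 0 0 0
```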

Downloads

There is nothing to download, see <linux_kernel_source>/drivers/scsi/scsi_debug.c .

Conclusion

Hopefully the design of the scsi_debug driver lends itself to many extensions. If you think that you have a useful extension that others may be interested in, please contact the linux-scsi list or the author with a patch.


Douglas Gilbert <dgilbert at interlog dot com>
with additions from
Martin K. Petersen <martin dot petersen at oracle dot com>

Last updated: 25th March 2020