Please note: on 25 March 2020 the contents of this page were moved to scsi_debug.html on this site. This page will receive no further updates.
The scsi_debug adapter driver simulates a variable number of SCSI disks, each sharing a common amount of RAM allocated by the driver to act as (volatile) storage. With one SCSI disk simulated, the scsi_debug driver is functionally equivalent to a RAM disk. When multiple SCSI disks are simulated, they could be viewed as multiple paths to the same storage device or simply separate devices. The driver can also be used to simulate very large disks, 2 terabytes or more in size, by "wrapping" its data access within the available RAM.
A small but hopefully useful set of SCSI commands is supported along with some crude error checking. The number of simulated devices and the shared RAM size for storage can be given as module parameters or boot time parameters if the scsi_debug driver is built into the kernel. The number of simulated devices (and hosts) can be varied at run time via sysfs. Various error conditions can be optionally generated to test the reaction of upper levels of the kernel and applications to abnormal situations.
To create real SCSI targets and Logical Units for something like an iSCSI server, the reader should preferably be looking at the Linux target subsystem. In short: if you want to create real SCSI devices then use the target subsystem; if you want to test or break something then read on.
This page describes this driver as found in Linux kernel version 4.18.0; earlier versions of this driver worked with the Linux kernel 2.6 series. For information about the scsi_debug driver found in the lk 2.4 production series see this page.
The parameter name given in the table below is the module parameter name and the sysfs file name. The boot time parameter (if the scsi_debug driver is built into the kernel (not recommended)) has "scsi_debug." prepended to it. Hence the boot time parameter corresponding to add_host=2 is scsi_debug.add_host=2 .
When the scsi_debug module is loaded, many parameters can be given on the command line, separated by spaces: for example to simulate 140 disks "modprobe scsi_debug max_luns=2 num_tgts=7 add_host=10" could be used. This will generate 140 devices: 10 hosts, each with 7 targets, each with 2 logical units.
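The arithmetic behind that example can be sketched in shell. The helper names sdebug_device_count and sdebug_modprobe_cmd are made up for this illustration and are not part of the driver; the second one only prints the command line rather than running it, so no root access or loaded module is needed to try it.

```shell
# Hypothetical helper: number of simulated devices for a given
# add_host ($1), num_tgts ($2) and max_luns ($3).
sdebug_device_count() {
    echo $(( $1 * $2 * $3 ))
}

# Hypothetical helper: print (do not run) the matching modprobe line.
sdebug_modprobe_cmd() {
    echo "modprobe scsi_debug max_luns=$3 num_tgts=$2 add_host=$1"
}

sdebug_device_count 10 7 2     # prints 140
sdebug_modprobe_cmd 10 7 2
```

Keeping the multiplication in one place makes it easy to check a planned configuration before loading the module.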
Sysfs parameters can be read with the cat command and written with the echo command. Sysfs expects a driver to be associated with a bus (e.g. PCI) so the "pseudo" bus was created for drivers like scsi_debug. An example:
# cd /sys/bus/pseudo/drivers/scsi_debug
# cat dev_size_mb
8
# echo 1 > add_host
# echo 64 > dev_size_mb
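For scripts, the cat and echo usage above can be wrapped in two small functions. This is only a sketch: the names sdebug_get, sdebug_set and the SDEBUG_SYSFS variable are invented here, and the directory is overridable so the helpers can be tried against a scratch directory when the module is not loaded.

```shell
# Directory holding the driver's sysfs attributes. Overridable
# (an assumption of this sketch) for experiments without the module.
SDEBUG_SYSFS="${SDEBUG_SYSFS:-/sys/bus/pseudo/drivers/scsi_debug}"

# Hypothetical helper: read one parameter, equivalent to cat.
sdebug_get() {
    cat "$SDEBUG_SYSFS/$1"
}

# Hypothetical helper: write one parameter, equivalent to echo with
# a redirect. Needs root when pointed at the real sysfs directory.
sdebug_set() {
    echo "$2" > "$SDEBUG_SYSFS/$1"
}
```

With the module loaded, "sdebug_get add_host" would then show the current host count and "sdebug_set add_host 1" would add one host.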
In the /sys/module/scsi_debug/parameters directory the parameters used when the scsi_debug module was started (or their default values) are listed. Even though some of those parameters have writeable permissions, writing to them has no effect on the driver. Note that some parameters appear in this directory but not in the /sys/bus/pseudo/drivers/scsi_debug directory. If the sysfs access cell for a parameter in the following table is blank then it doesn't appear in the /sys/bus/pseudo/drivers/scsi_debug directory but does appear in the /sys/module/scsi_debug/parameters directory.
Here is a list of scsi_debug specific driver parameters:
Parameter name | default value | sysfs access | sysfs write effect | new in version | notes |
add_host | 1 | read-write | immediate | | can add or remove hosts at runtime |
ato | 1 | read only | - | 1.81 | application tag ownership (0 -> disk, 1 -> host) |
cdb_len | 10 | read-write | next command | 0187 | 6, 10, 12, 16 and 32 accepted, other numbers treated same as 10. Size of READs, WRITEs and MODE SENSEs generated by the sd driver for the block layer. When 32 is given, it is treated as if 16 was given. |
clustering | 0 | | | 1.84 | enable large transfers |
delay | 1 | read-write | next command | | IO command response delay: units are jiffies (configurable: 1 to 10 ms). 0: no delay, all in one thread; -1: use "hi" tasklet; -2: use normal tasklet |
dev_size_mb | 8 | read only | | | units are Mebibytes (2**20 bytes) |
dif | 0 | read-only | | 1.81 | data integrity field type [T10: protection type] |
dix | 0 | read-only | | 1.81 | data integrity extension mask; check integrity when non zero |
doublestore | 0 | read-write | | 1.88 | when 0 (its default) then one data store (of dev_size_mb) is shared by every scsi_debug device. When set to 1 two data stores are created and they are allocated to scsi_debug devices in an alternating fashion. Note that each scsi_debug device has its own metadata (e.g. start/stop state and Unit Attention state). |
dsense | 0 | read-write | immediate | 1.81 | 0 -> fixed; 1 -> descriptor sense format |
every_nth | 0 | read-write | n commands from now | | for error injection: 0 -> don't do error injection. When non zero (it can be negative) statistics parameter will be set to 1 if it isn't already. |
fake_rw | 0 | read-write | next command | 1.80 | when set does no processing when a READ or WRITE command (of any cdb size) is received. When fake_rw=1 no ram is allocated. |
guard | 0 | read-only | | 1.81 | protection checksum: 0 -> crc; 1 -> ip |
host_lock | 0 | read-write | next command | 1.84, 1.88 | when set wraps each submitted command in a host_lock, which is detrimental in a multi-queue system. From version 1.88 this parameter is ignored. |
inq_product | "scsi_debug" | | | 0187 | user can set in module start parameters, 16 bytes long, pad spaces to the right to comply with SPC requirements. |
inq_rev | driver_ver | | | 0187 | For example: "0187". Dropped decimal point and added leading "0" in this version. User can set in module start parameters, 4 bytes long |
inq_vendor | "Linux" | | | 0187 | user can set in module start parameters, 8 bytes long, pad spaces to the right to comply with SPC requirements. |
lbprz | 1 | | | 2012 | LB provisioning: returns 0s when reading unmapped block |
lbpu | 0 | | | 2012 | LB provisioning: support UNMAP |
lbpws | 0 | | | 2012 | LB provisioning: support WRITE SAME(16) and UNMAP |
lbpws10 | 0 | | | 2012 | LB provisioning: support WRITE SAME(10) |
lowest_aligned | 0 | | | 1.81 | RCAP_16's lowest aligned logical block address (max: 0x3fff) |
map | | read-only | | | when logical block provisioning is active, it shows the internal provisioning map. Otherwise it shows '0-<sdebug_store_sectors>'. |
max_luns | 1 | read-write | next positive add_host or scan | | responds to luns: 0 ... (max_luns-1) or 1 ... (max_luns-1) if no_lun_0 is set. In 1.86 the max_lun maximum increased from 31 to 256. Uses LUN peripheral device addressing format (address_method=0, bus_identifier=0). |
max_queue | 192 | read-write | next command | 1.82 | number of commands driver can queue before telling mid-level it is full. Safe to change when commands already queued. |
medium_error_count | 10 | read-write | | 0188 | only active when bit 1 (0x2) of the opts parameter is set |
medium_error_start | 0x1234 | read-write | | 0188 | only active when bit 1 (0x2) of the opts parameter is set |
ndelay | 0 | read-write | next command | 1.84 | IO command response delay: units are nanoseconds. If > 0 then the delay parameter will be ignored (it appears as -9999) |
no_lun_0 | 0 | read-write | next positive add_host or scan | 1.77 | no lun 0 but responds to INQUIRY and REPORT LUNS as per SPC-2 |
no_uld | 0 | read only | | 1.82 | devices (LUs) created by this driver will only attach to sg and bsg devices. So depending on their ptype (peripheral device type) there will be no corresponding /dev/sd*, /dev/sr*, /dev/st* or /dev/ses* device nodes. There will be /dev/sg* and /dev/bsg/<h:c:t:l> device nodes. |
num_parts | 0 | read only | | | number of partitions |
num_tgts | 1 | read-write | next positive add_host or scan | | targets per host |
opt_blks | 64 | | | 1.84 | 'Optimal transfer length' field in Block Limits VPD page |
opt_xferlen_exp | physblk_exp | | | 1.84 | Controls 'Optimal transfer length granularity' field in Block Limits VPD page |
opts | 0 | read-write | usually following commands | | 0 -> quiet and no error injection |
physblk_exp | 0 | | | 1.81 | 2**physblk_exp sets READ CAPACITY(16)'s logical blocks per physical block exponent field |
ptype | 0 | read-write | next positive add_host or scan | | peripheral device type (0==disk) |
random | 0 | read-write | | 1.89 | when 0 (the default) delays the response of media access SCSI commands as precisely as possible to the duration indicated by the delay and ndelay parameters. When 1 it chooses a delay from a uniform distribution from no delay (0) to the duration indicated by the delay and ndelay parameters. When multiple threads are issuing commands random=1 can be used to simulate out-of-order responses. |
removable | 0 | read-write | | | When non-zero sets the RMB bit in the INQUIRY response indicating the device is removable |
scsi_level | 7 | read only | | | from: 0 (no compliance), 1, 2 (SCSI-2), 3 (SPC), 4 (SPC-2), 5 (SPC-3), 6 (SPC-4), 7 (SPC-5) |
sector_size | 512 | read only | | 1.81 | logical block size in bytes. 512, 1024, 2048 and 4096 accepted |
statistics | 0 | read-write | next command | 1.86 | collect statistics that are output by 'cat /proc/scsi/scsi_debug/<host_num>'. Needs kernel built with CONFIG_SCSI_PROC_FS selected. Due to sysfs policy this is not superseded by sysfs. [Don't believe everything you read in kernel config menus.] |
strict | 0 | read-write | next command | 1.85 | check for bits set in the reserved part of SCSI command blocks. If found report with the position of the first offending bit. |
submit_queues | 1 | read-only | | 0187 | multi-queue setting from 1 (i.e. non-mq) to <= number of processors on the machine |
unmap_alignment | 0 | | | 1.81 | Block limits VPD page's unmap granularity alignment |
unmap_granularity | 1 | | | 1.81 | Block limits VPD page's optimal unmap granularity |
unmap_max_blocks | 0xffffffff | | | 1.81 | Block limits VPD page's maximum unmap LBA count |
unmap_max_desc | 255 | | | 1.81 | Block limits VPD page's maximum unmap block descriptor count |
uuid_ctl | 0 | read only | | 1.86 | if 1 then each LU name is an internally generated UUID; if 2 then all LUs share the same UUID and if 0 then the LU name is a locally assigned NAA |
virtual_gb | 0 | read-write | immediate, next READ CAPACITY | 1.79 | When 0 then device is dev_size_mb sized ram disk. When n > 0, "virtual" n Gibibyte size disk, wrapping on dev_size_mb actual ram. The Gibibyte unit is 2**30 bytes |
vpd_use_hostno | 1 | read-write | next positive add_host or scan | 1.80 | the driver generates serial numbers and SAS naa-5 addresses based on host number ("hostno"), target id and lun. When set to 0, the generated numbers ignore "hostno". |
wp | 0 | read-write | | 1.89 | This parameter is for Write Protection. When 0 (the default) store modifying data access commands are permitted. When 1 store modifying data access commands are not allowed. |
write_same_length | 0xffff | | | 2012 | maximum blocks per WRITE SAME command |
zbc | 0 ['none'] | read-only | | 1.89 | The default is 0 or the string 'none'. To specify host-aware scsi_debug devices use 1 or 'host-aware'; this is currently not implemented and does nothing. To specify host-managed scsi_debug devices use 2 or 'host-managed' which will set the ptype (i.e. the Peripheral Device Type (pdt)) to 0x14. The two latter strings can be shortened to 'aware' and 'managed'. After the scsi_debug module is loaded with 'zbc=managed' say, using sysfs to change this parameter to 'none' will turn all scsi_debug devices into normal disk simulations with a pdt of 0x0. |
zone_max_open | 8 | | | 1.89 | |
zone_nr_conv | | | | 1.89 | |
zone_size_mb | | | | 1.89 | |
The add_host parameter is the number of hosts (HBAs) to simulate. The default is 1. For boot time and module loads the allowable values are 0 through to a large positive number. For sysfs writes, a value of 0 does nothing while a positive number adds that many hosts and a negative number removes that number of hosts. A sysfs read of this parameter shows the current number of hosts scsi_debug is simulating. No more than num_tgts target ids will be used per host. Target ids are in ascending order from 0 excluding the target id that is used by the initiator (i.e. HBA) if any. The default setting of num_tgts is 1. The default setting for max_luns is 1. So the number of pseudo disks simulated at driver initialization time is (add_host * num_tgts * max_luns). Note that if any of these three parameters is set to zero at kernel boot time or module load time then no devices are created. Modifying the add_host parameter in sysfs can be used to simulate hot plugging and unplugging of hosts. See below for adding and deleting individual scsi devices.
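The sysfs write rules for add_host described above can be modelled with a few lines of shell. This is a sketch, not driver code: the function name is invented, and stopping at zero when more hosts are removed than exist is an assumption made here rather than something this page states.

```shell
# Hypothetical model of a sysfs write to add_host: prints the host
# count after writing value $2 when $1 hosts currently exist.
# Assumption: removing more hosts than exist stops at zero.
sdebug_hosts_after_write() (
    current=$1
    delta=$2
    result=$(( current + delta ))   # 0 leaves the count unchanged
    if [ "$result" -lt 0 ]; then
        result=0
    fi
    echo "$result"
)

sdebug_hosts_after_write 1 2     # one host, then "echo 2 > add_host"
sdebug_hosts_after_write 3 -1    # three hosts, then "echo -1 > add_host"
```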
The ato parameter sets the field of the same name in the control mode page. The default value is 1 which implies the host is the application tag owner. A value of 0 implies the device server (e.g. the (pseudo) disk) is the application tag owner.
The cdb_len parameter controls the SCSI cdb lengths generated by the sd driver typically when it receives requests from the block layer. There are 3 bool internal variables: use_10_for_rw, use_16_for_rw and use_10_for_ms. "ms" is MODE SENSE/SELECT whose cdb can be 6 or 10 bytes long. If both use_10_for_rw and use_16_for_rw are false then READ(6) or WRITE(6) is used if the LBA and the number_of_blocks are not too large. This parameter can have these settings:
6: use_10_for_rw=false, use_16_for_rw=false, use_10_for_ms=false: try to use READ(6), WRITE(6) and MODE SENSE(6)
10: use_10_for_rw=true, use_16_for_rw=false, use_10_for_ms=false: try to use READ(10), WRITE(10) and MODE SENSE(6)
12: use_10_for_rw=true, use_16_for_rw=false, use_10_for_ms=true: try to use READ(10), WRITE(10) and MODE SENSE(10)
16: use_10_for_rw=false, use_16_for_rw=true, use_10_for_ms=true: try to use READ(16), WRITE(16) and MODE SENSE(10)
others: mapped to 10
Note that this parameter has no control over the sd driver's use of READ(32) and WRITE(32) commands which are generated for some settings of Protection Information (PI).
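The cdb_len settings listed above amount to a small lookup table, sketched here as a shell function. The function name is made up for illustration; the handling of 32 follows the note in the parameter table that it is treated as if 16 was given.

```shell
# Hypothetical helper: print the three sd flags that a cdb_len value
# selects, following the list above. Unlisted values map to 10 and
# 32 is handled like 16, as noted in the parameter table.
sdebug_cdb_len_flags() {
    case "$1" in
        6)     echo "use_10_for_rw=false use_16_for_rw=false use_10_for_ms=false" ;;
        12)    echo "use_10_for_rw=true use_16_for_rw=false use_10_for_ms=true" ;;
        16|32) echo "use_10_for_rw=false use_16_for_rw=true use_10_for_ms=true" ;;
        *)     echo "use_10_for_rw=true use_16_for_rw=false use_10_for_ms=false" ;;
    esac
}

sdebug_cdb_len_flags 16
```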
The clustering parameter informs the SCSI mid layer whether (1) or not (0) clustering is enabled. The default is 0 (not enabled). Setting this parameter facilitates large transfers of data with a single command.
The delay parameter is the number of jiffies by which the driver will delay responses. The default is 1 jiffy unless the ndelay parameter is given, see its description. Setting this parameter to 0 will cause the response to be sent back to the mid level before the request function is completed. The "jiffy" is a kernel space jiffy (typically the largest HZ setting yields a 1 millisecond jiffy on i386) rather than a user space jiffy (with USER_HZ a jiffy is typically 10 milliseconds on i386). HZ and USER_HZ are configurable in the kernel build. Both delayed and immediate responses are permitted; however, delayed responses are more realistic. For delayed responses, a kernel timer is used. [Real adapters would generate an interrupt when the response was ready (i.e. the command had completed).] For a fast ram disk set the delay parameter to 0. These SCSI commands ignore the delay parameter and respond immediately: INQUIRY, REPORT LUNS, REQUEST SENSE, SYNCHRONIZE CACHE plus various other non "media access" commands. TEST UNIT READY is considered a media access command.
The delay parameter may be set to -1 or -2 which uses a kernel tasklet to generate a more or less immediate response (but in a different kernel thread). The -1 variant schedules a high priority tasklet while -2 schedules a normal priority tasklet. Trying to write a new value to delay while there are queued command responses may result in an EBUSY error.
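Because the length of a jiffy depends on the kernel's HZ setting, the real-time length of a positive delay value can be estimated as below. This is a rough sketch with an invented function name; it covers only positive delay values and ignores timer overhead.

```shell
# Hypothetical helper: approximate response delay in microseconds
# for a delay of $1 jiffies on a kernel built with HZ=$2.
sdebug_delay_us() {
    echo $(( $1 * 1000000 / $2 ))
}

sdebug_delay_us 1 1000    # HZ=1000: prints 1000 (1 ms)
sdebug_delay_us 1 100     # HZ=100:  prints 10000 (10 ms)
```

The two calls reproduce the "1 to 10 ms" range given for the delay parameter in the table.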
The dev_size_mb parameter allows the user to specify the size of the simulated storage. The unit is Mebibytes (each 2**20 bytes and a bit larger than a Megabyte) and the default value is 8. The maximum value depends on the capabilities of the vmalloc() call on the target architecture. If the module fails to load with a "cannot allocate memory" message then a "vmalloc=nn{KMG}" boot time argument may be needed. [See the kernel source file: Documentation/kernel-parameters.txt for more information on this.] The RAM reserved for storage is initialized to zeros which leads the sd (scsi disk) driver and the block layer to believe there is no partition table present. Partitions can be simulated with num_parts (see below). All simulated dummy devices share the same RAM. If a value of 0 or less is given then dev_size_mb is forced to 1 so 1 MB of RAM is used. Given 512 byte logical blocks, the largest ramdisk that can be allocated is 2 TB but it is unlikely a system would be able to allocate that much ram (a situation that would be bypassed if fake_rw=1). Very large amounts of "virtual" storage can be simulated with the virtual_gb parameter (see below).
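The relationship between dev_size_mb and the number of logical blocks in the store is simple arithmetic, sketched here with an invented helper name. The second argument is the logical block size in bytes (the sector_size parameter).

```shell
# Hypothetical helper: logical blocks in a store of $1 Mebibytes
# when each logical block is $2 bytes long.
sdebug_store_blocks() {
    echo $(( $1 * 1048576 / $2 ))
}

sdebug_store_blocks 8 512    # default store: prints 16384
```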
The dif parameter sets the T10 protection type which is a value between 0 and 3 where 0 (the default) is no protection. Protection information is extra bytes of data (typically 8) associated with blocks of data transferred between a SCSI initiator and a SCSI block logical unit (as defined in T10 SBC standards). T10 protection information is often called the "data integrity field" hence the name DIF. For information about DIF and DIX see https://oss.oracle.com/projects/data-integrity/documentation/ .
The dix parameter when set causes protection information to be carried between the operating system and the SCSI initiator. DIX is an abbreviation of "data integrity eXtension" and can be viewed as a front end to DIF. When its value is zero (the default) then no protection information is carried within the operating system. When the dix parameter is a non zero value then the dix type will be the same as the dif parameter. So if dif=2 and dix=1 then both DIF and DIX are set to type 2 protection. Note that if dif=0 it doesn't matter what the dix parameter is, both DIF and DIX are set to type 0 protection (which is no protection).
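The interaction between dif and dix described above can be written out as a small decision function. The function name and output format are invented for this sketch; it only restates the rules in the two preceding paragraphs.

```shell
# Hypothetical helper: print the effective DIF and DIX protection
# types for dif=$1 and dix=$2, following the description above.
sdebug_protection_types() (
    dif=$1
    dix=$2
    if [ "$dif" -eq 0 ] || [ "$dix" -eq 0 ]; then
        dix_type=0          # nothing carried within the OS
    else
        dix_type=$dif       # non zero dix follows the dif type
    fi
    echo "DIF=$dif DIX=$dix_type"
)

sdebug_protection_types 2 1
```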
The every_nth parameter takes a decimal number as an argument. When this number is greater than zero, then incoming commands are counted and when <n> is reached then the associated command generates some sort of error. Currently the available errors are timeout (when "opts & 4" is true) and RECOVERED_ERROR (when "opts & 8" is true) . Once the command count reaches <n> then it is reset to zero. For example setting every_nth to 3 and opts to 4 will cause every third command to be ignored (and hence a timeout). If every_nth is not given it is defaulted to 0 and timeouts and recovered errors will not be generated. Note that for the "every nth" mechanism to work the statistics parameter needs to be set.
If every_nth is negative then an internal command counter counts down to that value and when it is reached, continually generates the error condition (specified in opts) on each newly received command. The driver flags this continual error state by setting every_nth to -1 . The user can stop error conditions being generated on receipt of every subsequent command by writing 0 to every_nth (or opts ).
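The positive every_nth case (count commands, inject an error when the count reaches n, then reset the count) can be simulated as follows. This is a model written for this page with an invented function name; it does not cover the negative every_nth behaviour.

```shell
# Hypothetical model of positive every_nth: print which of the
# first $2 commands would have an error injected when every_nth=$1.
sdebug_nth_hits() (
    nth=$1
    total=$2
    count=0
    hits=""
    cmd=1
    while [ "$cmd" -le "$total" ]; do
        count=$(( count + 1 ))
        if [ "$nth" -gt 0 ] && [ "$count" -ge "$nth" ]; then
            hits="$hits $cmd"
            count=0             # counter resets once n is reached
        fi
        cmd=$(( cmd + 1 ))
    done
    echo $hits
)

sdebug_nth_hits 3 10    # prints: 3 6 9
```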
The fake_rw parameter instructs the scsi_debug driver to ignore all READ and WRITE commands and return a GOOD status. This means the data "read" when fake_rw is set is whatever was previously in the scatter gather list. The default value is 0 (i.e. process READ and WRITE commands). This parameter is for testing and when set can confuse the kernel or utilities that look for partitions and other information on a "disk".
The guard parameter, when set to zero (the default), uses the T10 defined CRC in the protection information. When set to one, the IP (internet protocol) checksum (as used by iSCSI ?) is used.
The host_lock parameter indicates whether each command (excluding its response delay and associated callback into the mid-layer) is surrounded by a per host host_lock (which is a kernel "spin lock"). In a SCSI multi-queue system the presence of this host lock will have the effect of serializing all commands from a host, which is detrimental to system performance. Prior to version 1.84 this parameter was not available and the host_lock surrounded all commands. In version 1.84 and later the default is 0 which means the host_lock is not applied. Set host_lock=1 for the old behaviour. In version 1.88 this functionality (i.e. the host_lock) was removed and setting this parameter has no effect. It is kept so that scripts that call it will not break.
The inq_product parameter is the 16 byte ASCII string (left justified, space characters to the right) that is reported by this driver's standard INQUIRY response. The default is "scsi_debug ".
The inq_rev parameter is the 4 byte ASCII string (left justified, space characters to the right) that is reported by this driver's standard INQUIRY response. The driver version number (was "1.86") has been reformatted to be suitable for this field. The default value is now "0187" and will increase as changes are added to this driver.
The inq_vendor parameter is the 8 byte ASCII string (left justified, space characters to the right) that is reported by this driver's standard INQUIRY response. The default is "Linux ".
The lbpu parameter, if set, causes the logical block provisioning VPD page to set the field of the same name. The default is to set the LBPU field to 0. When set this field indicates the UNMAP command is supported.
The lbpws and lbpws10 parameters cause the corresponding bits in the logical block provisioning VPD page to be set. They imply that the UNMAP field within the WRITE SAME(16) and WRITE SAME(10) commands respectively is supported.
The lbprz parameter, if set, causes the logical block provisioning VPD page to set the field of the same name. When this field is set, reading unmapped logical blocks will return block(s) of data full of zeros.
The lowest_aligned parameter sets the field called LOWEST ALIGNED LOGICAL BLOCK ADDRESS in the READ CAPACITY (16) command response. The default is zero which implies the logical block size and the physical block size are the same.
The max_luns parameter allows an upper limit to be placed on the logical unit number (lun) that the scsi_debug driver will respond to. A value of 2 means that this driver will respond to logical unit numbers 0 and 1. If max_luns is modified by a sysfs write then the scsi_debug driver modifies the scsi_host::max_lun member of all hosts that it owns. When max_luns is modified by a sysfs write then it will take effect the next time a host is added (see add_host) or when a scan is done on any existing host. The mid level scanning code will scan for up to but not including max_scsi_luns which is a SCSI mid level boot and module load time parameter.
The max_queue parameter indicates the maximum number of queued responses the driver can handle. This defaults to an internal define in the scsi_debug driver called SCSI_DEBUG_CANQUEUE which is currently 192 on 64 bit machines (96 on 32 bit machines). If both the delay and ndelay parameters are 0, no commands have queued responses. If there is an attempt to exceed this value then either SCSI_MLQUEUE_HOST_BUSY is returned to the mid-layer (the default) or a status of TASK_SET_FULL (if the 0x400 opts mask is set). Sysfs can be used at any time to change the value of max_queue, even when there are queued command responses.
The medium_error_count parameter indicates the number of blocks, including the medium_error_start LBA, on which to yield a SCSI MEDIUM ERROR sense key. This only occurs when the opts parameter has its bit 1 (i.e. 0x2) set. Its default value is 10.
The medium_error_start parameter indicates the first LBA to yield a SCSI MEDIUM ERROR sense key. This only occurs when the opts parameter has its bit 1 (i.e. 0x2) set. Its default value is 0x1234 (4660 in decimal).
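Whether a given read would touch the medium error range can be estimated with a range overlap test. This is a sketch with an invented function name; treating any overlap between the read and the range as a hit is an assumption made here, and the defaults are the documented ones (0x1234 and 10).

```shell
# Hypothetical check: does a read of $2 blocks starting at LBA $1
# touch the medium error range? $3 and $4 optionally override
# medium_error_start and medium_error_count.
# Assumption: any overlap with the range counts as a hit.
sdebug_hits_medium_error() (
    lba=$(( $1 ))
    num=$(( $2 ))
    start=$(( ${3:-0x1234} ))
    count=$(( ${4:-10} ))
    last=$(( start + count - 1 ))     # last LBA inside the range
    if [ "$lba" -le "$last" ] && [ $(( lba + num )) -gt "$start" ]; then
        echo yes
    else
        echo no
    fi
)

sdebug_hits_medium_error 0x1234 1
sdebug_hits_medium_error 4670 1
```

With the defaults the range covers LBAs 4660 to 4669 inclusive, so the first call lands inside it and the second starts just past its end.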
The ndelay parameter is the response delay whose units are nanoseconds. This mechanism depends on high resolution timers in the kernel which may not be supported on small or old systems (it is a kernel build config option). Its default value is 0 which means the delay parameter is operative. If ndelay is a positive value then a response delay for that many nanoseconds is active (and to indicate the delay parameter is overridden, it is set to -9999). Depending on the hardware, setting ndelay to less than a few microseconds probably causes no further reduction in the observed response delays. Trying to write a new value to ndelay while there are queued command responses may result in an EBUSY error.
The no_lun_0 parameter when set to a non zero value causes a lun 0 INQUIRY response of peripheral_qualifier==3 indicating there is no actual lu there. As required by SPC, lun 0 will still respond to a REPORT LUNS command. If the REPORT LUNS has a 'select report' code of 1 or 2, then one of the luns reported will be the REPORT LUNS well known logical unit (lun 49409 or 0xc101). The default value is 0. If max_luns is greater than 1, the first lun generated by scsi_debug will be lun 1 (since lun 0 is skipped). The REPORT LUNS well known logical unit (wlun) only supports the INQUIRY, REPORT LUNS, REQUEST SENSE and TEST UNIT READY SCSI commands. To make this wlun appear as a scsi generic (sg) device see the REPORT LUNS well known LUN example below.
The num_parts parameter writes a partition table to the ramdisk if the parameter's value is greater than 0. The default is 0 so in that case the ramdisk is simply all zeros. When num_parts is greater than zero a DOS format primary partition block is written to logical block 0, so the number of partitions is limited to a maximum of 4. The partitions are given an id of 0x83 which is a "Linux" partition. The available space on the ramdisk is roughly divided evenly between partitions when 2 or more partitions are requested. The partitions are not initialized with any file system. Even if no partitions are specified, a utility like fdisk can be used to add them later.
The num_tgts parameter allows the number of targets per host to be specified. It should be 0 or greater. Target id numbers start at 0 and ascend, bypassing the target id of the initiator (i.e. the HBA). If num_tgts is modified by a sysfs write then the scsi_debug driver modifies the scsi_host::max_id member of all hosts that it owns. When num_tgts is modified by a sysfs write then it will take effect the next time a host is added (see add_host) or when a scan is done on any existing host.
The opt_blks parameter is placed in the "Optimal transfer length" field of the Block Limits VPD page. Its default value is 64.
The opt_xferlen_exp parameter (with help from the physblk_exp parameter) controls the "Optimal transfer length granularity" field (OTLG) in the Block Limits VPD page. If 0 (default) or less than, or equal to, physblk_exp then the OTLG field is set to 2**physblk_exp making physblk_exp the effective default value. Otherwise, if this parameter is greater than physblk_exp then the OTLG field is set to 2**opt_xferlen_exp.
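The rule for the OTLG field comes down to taking the larger of the two exponents. A minimal sketch, with an invented function name:

```shell
# Hypothetical helper: value placed in the OTLG field for
# opt_xferlen_exp=$1 and physblk_exp=$2, per the rule above.
sdebug_otlg() (
    opt=$1
    phys=$2
    if [ "$opt" -gt "$phys" ]; then
        exp=$opt
    else
        exp=$phys       # physblk_exp is the effective default
    fi
    echo $(( 1 << exp ))
)

sdebug_otlg 0 3    # opt_xferlen_exp left at its default
sdebug_otlg 5 3    # opt_xferlen_exp larger than physblk_exp
```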
The opts parameter takes a number as an argument which is the bitwise "or" of several flags. The flags that mention "nth" are only active when every_nth != 0 . So-called "read-write" commands include some others such as VERIFY. The flags supported are:
1 - "noisy" flag: all calls to entry points of driver are logged. Commands to be executed are shown in hex. Additional information such as check conditions, command aborts and resets are logged
2 - "medium error" flag: simulates a SCSI MEDIUM ERROR when LBA medium_error_start (default: 0x1234 (4660 in decimal)) is read. The following medium_error_count blocks (default: 10 blocks) less 1 also yield a medium error.
4 - ignore "nth" command causing a timeout.
8 - cause "nth" read or write command to yield a RECOVERED_ERROR.
0x10 - cause "nth" read-write command to yield an ABORTED_COMMAND (ack/nak timeout) which is a SAS transport error.
0x20 - cause "nth" read-write command to yield an ABORTED_COMMAND (logical block guard check failed), nominally a DIF (Protection Information) error
0x40 - cause "nth" read-write command to yield an ABORTED_COMMAND (logical block guard check failed), nominally a DIX error
0x80 - ignore "nth" media access command causing a timeout
0x100 - cause "nth" read command to yield half the data it was requested to read
0x200 - log generation of TASK SET FULL and host busy plus changes to queue depth and type
0x400 - if max_queue is exceeded yield a TASK SET FULL (default: host busy)
0x800 - cause "nth" read-write command whose queue_depth is at its maximum value to yield a status of TASK SET FULL
0x1000 - set WCE field in the caching page to 0 (default WCE=1)
0x2000 - log only abort commands and the various levels of reset
0x4000 - used together with the noisy flag (1) to suppress the logging of cdbs; additional information (if any) is still logged.
0x8000 - cause "nth" read or write command to yield a "host busy" (mid-level sent SCSI_MLQUEUE_HOST_BUSY)
0x10000 - cause "nth" read-write command to be aborted (via a call to block layer)
The opts "noisy" (or debug) flag will cause all scsi_debug entry points to be logged in the system log (and often sent to the console depending on how kernel informational messages are processed). With this flag set commands are listed in hex and if they yield a result other than successful then that is shown. In a busy system this may prove to be too much log "noise" in which case this combination of flags may be useful: opts=0x6201 .
The opts "medium error" flag will cause any read command whose LBA starts at medium_error_start (default: 0x1234 (4660 in decimal)) for medium_error_count blocks to return a medium error indication to the mid level. The "ignore nth" flag is only active when every_nth != 0 . When an internal command counter reaches the value in every_nth and the "ignore nth" flag is set, then this command is ignored (i.e. quietly not processed). Typically this will cause the SCSI mid level code to timeout the command which leads to further error processing. The internal command counter is reset to zero whenever opts is written to, whenever every_nth is written to, when the every_nth value is reached and at driver load time. The "recovered error" flag works in a similar fashion to the "ignore nth" flag, however when the every_nth value is reached and it is either a read or a write command then the command is processed normally but yields a "recovered error" indication. Such an indication is _not_ a hard error but for a real disk could indicate deteriorating media. The "aborted command" flag injects a transport error in a similar fashion to the way the "recovered error" flag works. A minor point: the kernel boot time and module load time opts parameter is a decimal integer. However the output sysfs value is a hexadecimal number (output as 0x9 for example) while the input value is interpreted as hexadecimal if prefixed by "0x" and decimal otherwise. When combining these flags it is easier to consider them as hexadecimal numbers.
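Combining and testing opts flags is ordinary bitwise arithmetic, which the shell can do directly. The helper names below are invented for this sketch; the flag values are the ones listed above.

```shell
# Hypothetical helper: OR together any number of opts flag values
# and print the result in the 0x form that sysfs accepts.
sdebug_opts() (
    mask=0
    for flag in "$@"; do
        mask=$(( mask | $flag ))
    done
    printf '0x%x\n' "$mask"
)

# Hypothetical helper: succeed if flag $2 is set in opts value $1.
sdebug_opts_has() {
    [ $(( $1 & $2 )) -ne 0 ]
}

# noisy + queue logging + abort/reset logging + suppress cdb logging
sdebug_opts 0x1 0x200 0x2000 0x4000    # prints 0x6201
```

The last call rebuilds the opts=0x6201 combination mentioned above from its individual flags.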
The physblk_exp parameter becomes the "Logical blocks per physical block exponent" field in the READ CAPACITY (16) response. The default value is 0 which means the logical block and physical block sizes are the same.
The ptype parameter allows the SCSI peripheral type to be set or modified. The default value is 0 which corresponds to a disk. Other useful peripheral types are 1 for tape, 3 for processor, 5 for dvd/cd and 13 for enclosure (SES).
The scsi_level parameter is the ANSI SCSI standard level that the simulated disk announces that it is compliant to. The INQUIRY response which is generated by scsi_debug contains the ANSI SCSI standard level value (in byte 2).
The sector_size parameter (default 512) is the logical block size in bytes (assuming ptype=0 which means a block storage device).
The statistics parameter controls whether several internal counters are incremented or not. For speed the default is 0 (i.e. don't collect statistics). The "every nth" mechanism requires those internal counters so specifying a non-zero every_nth parameter will cause the statistics collection to be turned on.
The
strict parameter can be 1 or 0
(the default). If 1 then it uses the cdb mask given in the REPORT
SUPPORTED OPERATION CODES command to check each command cdb received
by this driver. If any bit is set in the cdb but the corresponding
bit is not set in the mask, then the command is rejected with a
status of CHECK CONDITION, a sense key of ILLEGAL REQUEST and
additional sense of INVALID FIELD in CDB. The sense data also points
to the byte and bit position in the cdb that first failed the mask
comparison. Byte long (and longer) fields will always point at bit 7
as failed. Each cdb is scanned in ascending byte order.
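The mask comparison done when strict=1 can be sketched as follows. This is a simplified model, not the driver's code: the function name check_cdb_strict and the example mask are hypothetical, and reporting the most significant offending bit within a byte is an assumption of this sketch.

```python
def check_cdb_strict(cdb, mask):
    """Compare a cdb against its mask, byte by byte in ascending order.

    Returns None if every bit set in the cdb is also set in the mask,
    otherwise the (byte, bit) position of the first failure. Within a
    byte the most significant offending bit is reported (an assumption).
    """
    for byte_pos, (c, m) in enumerate(zip(cdb, mask)):
        offending = c & ~m & 0xFF
        if offending:
            return byte_pos, offending.bit_length() - 1
    return None


# Hypothetical mask for a 6 byte command: opcode byte fully allowed,
# bytes 1 to 4 reserved, some bits of the control byte allowed.
example_mask = [0xFF, 0x00, 0x00, 0x00, 0x00, 0xC7]

print(check_cdb_strict([0x00, 0x00, 0x00, 0x00, 0x00, 0x00], example_mask))
print(check_cdb_strict([0x00, 0x01, 0x00, 0x00, 0x00, 0x00], example_mask))
```

A non-None result corresponds to the case where the driver would reject the command with CHECK CONDITION, ILLEGAL REQUEST and INVALID FIELD in CDB.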
The
submit_queues parameter sets the number of submission queues
the SCSI multi-queue logic will maintain for this driver. The default
value is 1 which implies no multi-queue. If a value is given that
exceeds the number of processors on the machine then the value used
will be the number of processors on the machine. A warning is issued
to the log if the driver reduces this value.
The uuid_ctl
parameter controls whether a locally assigned NAA (64 bit value) is
used to identify each logical unit (LU) simulated by this driver, or
if a UUID (128 bit, RFC 4122) is used. If the value is 0 (the
default) a locally assigned NAA is used. If the value is 1 then a new
UUID (effectively a random value) is generated for each LU. If the
value is 2 then the same generated UUID is used for all LUs simulated
by this driver.
The virtual_gb parameter allows the
scsi_debug driver to simulate a much larger storage device than the
physical RAM available in the machine. When the virtual_gb
parameter is 0 (its default value) then the maximum storage available
is that indicated by the dev_size_mb
parameter. When the virtual_gb parameter is greater than zero,
that many Gibibytes (each of 2**30 bytes, slightly larger than a Gigabyte)
are reported by the READ CAPACITY command. Reading and writing of the
"Gigabytes" of data wraps around within the available
physical RAM (which the scsi_debug driver has allocated and is
dev_size_mb Mebibytes in size).
When the number of virtual Gibibytes is 2048 or greater then READ
CAPACITY (16) is needed to represent the size and READ (16) and/or
WRITE (16) are needed to access data at the 2048 Gibibyte boundary
and beyond. This boundary corresponds to 2**32 blocks (sectors),
assuming each is 512 bytes long. The "wrapping" action still allows
partitions to be written with fdisk and in many cases a file system
to be initialized. Trying to store and retrieve any useful data on
such a big virtual disk would not be wise! Setting the
dev_size_mb parameter to a prime
number that is larger than the default value (which is 8) and doesn't
starve the machine of resources seems to help in creating ext3 file
systems. It helps because mkfs writes the file system super block at
several offsets within the partition, and the wrap may otherwise cause
the file system header to be overwritten. The virtual_gb
option is designed for testing, not practical data storage.
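The arithmetic behind virtual_gb can be sketched as below. It assumes the wrap is a simple modulo on the block number over the blocks that fit in dev_size_mb. The function names are hypothetical and are not driver symbols.

```python
MIB = 1 << 20
GIB = 1 << 30


def reported_blocks(virtual_gb, dev_size_mb, sector_size=512):
    """Number of logical blocks that READ CAPACITY would report."""
    if virtual_gb > 0:
        return virtual_gb * GIB // sector_size
    return dev_size_mb * MIB // sector_size


def store_block(lba, dev_size_mb, sector_size=512):
    """Block within the shared RAM that a (possibly virtual) LBA lands on."""
    blocks_in_ram = dev_size_mb * MIB // sector_size
    return lba % blocks_in_ram


def needs_16_byte_cdbs(num_blocks):
    """True once the block count no longer fits the 32 bit LBA fields."""
    return num_blocks >= 1 << 32


blocks = reported_blocks(virtual_gb=2048, dev_size_mb=8)
print(blocks, needs_16_byte_cdbs(blocks), store_block(blocks - 1, dev_size_mb=8))
```

With the default dev_size_mb of 8 there are 16384 blocks of RAM, so LBA 16384 lands on the same RAM as LBA 0. That aliasing is what can clobber file system metadata written at regular offsets.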
The
vpd_use_hostno parameter affects the way the scsi_debug driver
generates its serial numbers, SAS and naa-5 addresses. When
vpd_use_hostno is set to 1 (its default value) then the host
number ("hostno"), target_id and lun are used to generate
the serial number, SAS and naa-5 addresses. The formula is "((hostno
+ 1) * 2000) + (target_id * 1000) + lun". When
vpd_use_hostno is set to 0 then the "hostno" term in
the formula is set to 0. This has the effect of making multiple
simulated hosts look like they are connected to the same drives (i.e.
there are only "num_tgts *
max_luns" unique simulated
devices). The kernel will still report "add_host
* num_tgts * max_luns"
devices but higher level multipath aware software may see the
difference.
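The formula quoted above is easy to try out. The following sketch only restates it; the function name device_number is hypothetical.

```python
def device_number(hostno, target_id, lun, vpd_use_hostno=1):
    """Number used in the serial number, SAS and naa-5 addresses.

    When vpd_use_hostno is 0 the hostno term is treated as 0, so the same
    target_id and lun on different hosts yield the same number.
    """
    host_term = hostno if vpd_use_hostno else 0
    return ((host_term + 1) * 2000) + (target_id * 1000) + lun


print(device_number(0, 0, 0))                     # first host, first device
print(device_number(1, 0, 0))                     # same device on the next host
print(device_number(1, 0, 0, vpd_use_hostno=0))   # host no longer distinguishes
```

With vpd_use_hostno=0 the last two calls differ only in the host, yet give the same number as the first call, which is what makes multiple hosts look like paths to the same drives.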
Below is a list of supported commands. Some do nothing (e.g. SYNCHRONIZE CACHE). Those that have interesting functionality have notes in brackets. If the feature was introduced in a recent version (i.e. since 1.76) then that is noted.
CLOSE ZONE [added in 1.89]
COMPARE AND WRITE [1.85: added]
FINISH ZONE [added in 1.89]
GET LBA STATUS
INQUIRY [vital product data pages: 0, 0x80, 0x83] [1.77: VPD pages: 0x85, 0x86, 0x87, 0x88, 0x89, 0xb0] [1.87: VPD pages: 0x84, 0xb1, 0xb2] [1.89: VPD page: 0xb6]
LOG SENSE [1.78: temperature(0xd) and informational exceptions(0x2f)] [1.80: support log subpages]
MODE SELECT (6), MODE SELECT (10) [1.84: changeable pages: 0x8 (caching), 0xa (control) and 0x1c (informational exceptions)]
MODE SENSE (6), MODE SENSE (10) [sense pages: 1 (rw error recovery), 2 (disconnect), 3 (format), 8 (caching), 0xa (control), 0x1c (informational exceptions), 0x3f (read all)] [1.77: subpage support plus SAS pages: 0x19,0 0x19,1 and 0x19,2]
OPEN ZONE [added in 1.89]
PRE-FETCH(10), PRE-FETCH(16) [added in 1.89]
PREVENT ALLOW MEDIUM REMOVAL
READ (6), READ (10), READ(12), READ(16), READ(32)
READ CAPACITY (10), READ CAPACITY (16) [1.79: added 16 byte command]
RELEASE (6), RELEASE (10)
REPORT LUNS [1.77: shows REPORT LUNS wlun]
REPORT REALMS [added in 1.89, not implemented]
REPORT SUPPORTED OPERATION CODES [1.85: added]
REPORT SUPPORTED TASK MANAGEMENT FUNCTIONS [1.85: added]
REPORT TARGET PORT GROUPS
REPORT ZONES [added in 1.89]
REQUEST SENSE [1.79: shows MRIE=6 failure prediction, power states]
RESERVE (6), RESERVE (10)
RESET WRITE POINTER [added in 1.89]
REZERO UNIT (which is REWIND for tapes)
SEND DIAGNOSTIC [1.78: maintains start and stop states, when stopped fails media access commands]
SYNCHRONIZE CACHE (10, 16)
TEST UNIT READY [1.78: in stopped state gives appropriate error]
UNMAP
VERIFY (10), VERIFY(16) supporting BYTCHK=1 or 3 [added in 1.89]
WRITE (6), WRITE (10), WRITE (12), WRITE (16), WRITE(32)
WRITE BUFFER
WRITE SAME(10), WRITE SAME(16)
WRITE SCATTERED(16, 32)
<< XDWRITEREAD (10) [which is a bidirectional command] [removed around lk 5.0] >>
The implementations of the above commands are sufficient for the scsi subsystem to detect and attach devices. The fdisk, e2fsck and mount commands also work as do the utilities found in the sg3_utils package (see the main page). Crude error processing picks up unsupported commands and attempts to read or write outside the available RAM storage area.
Modern SCSI devices use vital product data page 0x83 for identification. This driver yields both "T10 vendor identification" and "NAA" descriptors. The former yields an ASCII string like "Linux scsi_debug 4000" where the "4000" is ((host_no + 1) * 2000) + (target_id * 1000) + lun. In this case "4000" corresponds to host_no==1, target_id==0 and lun==0. The "NAA-5" descriptor is an 8 byte binary value that looks like this hex sequence: "51 23 45 60 00 00 0f a0" where the IEEE company id is 0x123456 (fake) and the vendor specific identifier in the least significant bytes is 4000 (which is fa0 in hex). [The "4000" is derived the same way for both descriptors.]
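The layout of the 8 byte NAA-5 value can be shown by packing its three fields: a 4 bit NAA field holding 5, the 24 bit IEEE company id and a 36 bit vendor specific identifier. The helper name naa5 is hypothetical. The company id is the fake one quoted above.

```python
def naa5(company_id, vendor_specific):
    """Pack an NAA-5 designator into its 8 byte big-endian form."""
    if not 0 <= company_id < (1 << 24):
        raise ValueError("company id must fit in 24 bits")
    if not 0 <= vendor_specific < (1 << 36):
        raise ValueError("vendor specific identifier must fit in 36 bits")
    value = (0x5 << 60) | (company_id << 36) | vendor_specific
    return value.to_bytes(8, "big")


# 4000 is the device number for host_no==1, target_id==0, lun==0
print(naa5(0x123456, 4000).hex(" "))
```

The printed bytes match the hex sequence given in the paragraph above.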
Read and write commands executed by the scsi_debug driver are atomic (i.e. a write to one scsi_debug device will not interrupt (split) a read from another scsi_debug device). So a read command will either yield the contents of RAM before a coincident write, or after the coincident write has finished.
The START STOP UNIT (SSU) and SYNCHRONIZE CACHE (SC) commands have special longer delay processing from version 0188 onward. For both commands if ndelay <= 10,000 (10 microseconds) then long delays are ignored. Otherwise SSU has at least a 1 second delay and if delay > 1 then its delay is that many seconds. And for SC its longer delay is 1/20 that of SSU (e.g. if delay=2 then SSU's delay is 2 seconds and SC's delay is 100 milliseconds).
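The rules in the previous paragraph can be restated as a small function. This is a sketch under assumptions rather than the driver's logic: it treats an ndelay of 0 as "ndelay not in use", and the function name long_delay_ms is hypothetical.

```python
def long_delay_ms(delay, ndelay, sync_cache=False):
    """Long delay in milliseconds for SSU (or SC), or None when skipped.

    delay and ndelay are the module parameters of the same names. An
    ndelay between 1 and 10,000 nanoseconds suppresses the long delay.
    """
    if 0 < ndelay <= 10_000:
        return None
    ssu_seconds = delay if delay > 1 else 1   # SSU waits at least 1 second
    delay_ms = ssu_seconds * 1000
    return delay_ms // 20 if sync_cache else delay_ms


print(long_delay_ms(delay=2, ndelay=0))                    # START STOP UNIT
print(long_delay_ms(delay=2, ndelay=0, sync_cache=True))   # SYNCHRONIZE CACHE
```

For delay=2 this gives 2000 milliseconds for SSU and 100 milliseconds for SC, matching the example in the text.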
scsi_debug supports emulating devices with logical block sizes
bigger than 512 bytes. This can be specified using the sector_size
option.
Some storage devices use physical block sizes
bigger than 512 bytes internally but expose a 512-byte logical block
size to the host for compatibility reasons. The physblk_exp
parameter can be used to indicate that the internal block size is 2^n
times bigger than the reported logical block size. For instance:
Supplying physblk_exp=3
on the command line will cause scsi_debug to simulate a device with
512-byte logical blocks and 4KB physical blocks.
Not all
storage devices have logical block 0 aligned to a physical block
boundary. These devices can be emulated using scsi_debug's
lowest_aligned option. The parameter
indicates the lowest LBA that is aligned to a physical block
boundary.
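The relationship between sector_size, physblk_exp and lowest_aligned can be checked with a little arithmetic. The sketch below is illustrative only; the function names are hypothetical and the alignment test is inferred from the description of lowest_aligned above.

```python
def physical_block_size(sector_size=512, physblk_exp=0):
    """Physical block size in bytes: 2**physblk_exp logical blocks."""
    return sector_size << physblk_exp


def is_physically_aligned(lba, physblk_exp, lowest_aligned=0):
    """True if lba starts on a physical block boundary.

    lowest_aligned is the lowest LBA that sits on such a boundary; every
    2**physblk_exp blocks after it is aligned as well.
    """
    blocks_per_physical = 1 << physblk_exp
    return (lba - lowest_aligned) % blocks_per_physical == 0


print(physical_block_size(512, 3))
print(is_physically_aligned(63, physblk_exp=3, lowest_aligned=7))
```

With physblk_exp=3 and lowest_aligned=7, LBA 63 falls on a physical block boundary while LBA 64 does not.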
SBC-3 introduced Logical block provisioning. That term covers both
"thin provisioning" (the earlier term for this facility)
and "over provisioning" as used in modern SSDs.
Thin
provisioning means that devices can report a capacity that is bigger
than the space actually allocated. When files are deleted, the
relevant blocks can be reclaimed by the storage device and used for
something else. And consequently only blocks that are actively in use
consume physical storage space.
SBC-3 specifies two
different approaches for marking blocks as unused: WRITE SAME(16)
with the UNMAP bit set, and the UNMAP command. scsi_debug supports
both methods and they are controlled via 4 module parameters:
unmap_max_desc specifies the maximum number of ranges that can be unmapped using a single UNMAP command. If this is set to 0, only WRITE SAME is supported and UNMAP will cause a check condition.
unmap_granularity specifies the granularity at which to track mapped blocks (specified in number of logical blocks). 2048 (1 MB) is a realistic value for disk arrays although some may have a finer granularity.
unmap_alignment specifies the first LBA which is naturally aligned on an unmap_granularity boundary.
unmap_max_blocks specifies the maximum number of blocks that can be unmapped using a single UNMAP command. Default is 0xffffffff.
Examples:
modprobe scsi_debug lbpws=1 unmap_max_desc=0 unmap_granularity=1
will simulate a device that only supports WRITE SAME(16) and which tracks usage on a per logical block basis. This is how most solid state drives work.
modprobe scsi_debug lbpu=1 unmap_max_desc=64 unmap_granularity=2048
will simulate a device that supports UNMAP and which is
provisioned in 1MB chunks. This is a common scenario for thinly
provisioned storage arrays.
The current block allocation
bitmap can be viewed from user space via:
cat /sys/bus/pseudo/drivers/scsi_debug/map
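How an LBA might be assigned to an entry in that bitmap can be sketched as follows. The index calculation is an assumption about how unmap_granularity and unmap_alignment interact, and the function names are hypothetical.

```python
def lba_to_map_index(lba, unmap_granularity, unmap_alignment=0):
    """Index of the provisioning map entry that covers lba.

    When unmap_alignment is non-zero the first entry is a short one that
    ends just before the first naturally aligned LBA.
    """
    if unmap_alignment:
        lba += unmap_granularity - unmap_alignment
    return lba // unmap_granularity


def granule_bytes(unmap_granularity, sector_size=512):
    """Amount of storage tracked by one map entry, in bytes."""
    return unmap_granularity * sector_size


print(granule_bytes(2048))                 # 2048 blocks of 512 bytes
print(lba_to_map_index(4095, 2048))        # last block of the second entry
print(lba_to_map_index(4096, 2048))        # first block of the third entry
```

With unmap_granularity=2048 and 512 byte blocks each entry tracks 1048576 bytes, the "1MB chunks" of the second example above.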
An important feature of the SCSI command sets is the concept of a
Unit Attention (UA). This is a mechanism for the "device server"
within a logical unit (e.g. a disk) to report to the originator (e.g.
a user space program, a file system or the kernel) that something,
not directly related to the command that was just sent, has happened.
That report takes the form of the command not being done and
sense data with the UNIT ATTENTION sense key being returned.
Additional information about the UA is provided in the sense data and
the originator is expected to take note. UAs are typically only
reported once so if the initiator repeats the command it should work
(or a different type of UA might be delivered).
An example
might make this clearer. It is possible to change the number of
logical blocks on a disk; the FORMAT command could do that. In the
scsi_debug driver even though dev_size_mb
cannot be changed at run time, the virtual_gb
parameter can be. If the virtual_gb
parameter is changed (via sysfs, after the driver has been running),
then the "Capacity data has changed" UA condition is set.
The next command sent to that device will receive that UA (with some
exceptions) in the returned sense data (and the command is not done).
The exceptions are the INQUIRY, REPORT LUNS and REQUEST SENSE
commands which skip UA reporting (see SAM-5 for details). Once the
originator sees that UNIT ATTENTION sense key, it should note the
reason, and repeat the command unless it is directly impacted. If the
command that got "hit" by this UA was a READ or a WRITE
then the originator might want to do a READ CAPACITY command first,
at least to check that the LBA given to the READ or WRITE command was
still in range.
The scsi_debug driver reports these Unit
Attentions:
Power on, reset, or bus device reset occurred
SCSI bus reset occurred
Mode parameters changed
Capacity data has changed
If there is more than one UA, then they are reported in the ascending order of that list.
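The reporting rules described in this section (each UA delivered once, in the order of the list, with INQUIRY, REPORT LUNS and REQUEST SENSE skipping UA reporting) can be modelled with a small class. This is a toy model for illustration only; the class and method names are hypothetical.

```python
UA_ORDER = [
    "Power on, reset, or bus device reset occurred",
    "SCSI bus reset occurred",
    "Mode parameters changed",
    "Capacity data has changed",
]
SKIP_UA = {"INQUIRY", "REPORT LUNS", "REQUEST SENSE"}


class SimulatedLu:
    """Toy logical unit that only tracks pending Unit Attentions."""

    def __init__(self):
        self.pending = set()

    def raise_ua(self, ua):
        if ua not in UA_ORDER:
            raise ValueError("unknown unit attention")
        self.pending.add(ua)

    def submit(self, command):
        """Return (status, sense key, additional sense) for a command."""
        if command not in SKIP_UA and self.pending:
            ua = min(self.pending, key=UA_ORDER.index)  # list order
            self.pending.remove(ua)                     # reported once only
            return ("CHECK CONDITION", "UNIT ATTENTION", ua)
        return ("GOOD", None, None)


lu = SimulatedLu()
lu.raise_ua("Capacity data has changed")   # e.g. virtual_gb was changed
print(lu.submit("INQUIRY"))                # not affected by the UA
print(lu.submit("READ (10)"))              # receives the UA, is not done
print(lu.submit("READ (10)"))              # the repeated command works
```

An originator that sees the UNIT ATTENTION sense key would normally note the reason and then simply repeat the command, as the third call does.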
In version 1.89 of this driver support was added for "host-managed" Zoned Block devices that comply with the "sequential write required" model. All scsi_debug devices generated when this module is loaded with the "zbc=host-managed" parameter will be of this type, which has a ptype value of 0x14 (i.e. their SCSI Peripheral Device Type (pdt)).
Since scsi_debug is for testing it seems more useful to build it
as a module rather than build it into the kernel. Some parameters
cannot be changed once the scsi_debug driver is running. So if it is
a module then it can be removed with rmmod and reloaded with
another modprobe call with the desired parameters.
When
the driver is loaded successfully simulated disks should be visible
just like other SCSI devices:
# modprobe scsi_debug
# lsscsi -s
[0:0:0:0]    disk    SEAGATE  ST33000650SS     0005  /dev/sda   3.00TB
[0:0:1:0]    enclosu Intel    RES2SV240        0d00  -          -
[4:0:0:0]    disk    ATA      ST3160812AS      D     /dev/sdb   160GB
[7:0:0:0]    disk    Linux    scsi_debug       0184  /dev/sdc   8.38MB
In this case there is a 3 TB SAS disk, an ATA disk and a small scsi_debug pseudo disk. The other device (at [0:0:1:0]) is a SCSI Enclosure Service (SES) device. The /dev/sdc pseudo disk is full of zeros and has no partitions. To get a partition the num_parts parameter could have been used on the modprobe line or it could be done from the command line with the fdisk /dev/sdc command. Assuming one ext3 partition is allocated to the whole pseudo disk (8 MB in this case) then the mkfs.ext3 /dev/sdc1 command can be used to make an ext3 file system. Now /dev/sdc1 can be mounted and treated like a normal file system. Naturally when the power is turned off anything stored in /dev/sdc1 will be forgotten.
Rather than mounting the pseudo disk, the sg3_utils package could be used to carry out various tests on it.
Information about the scsi_debug driver version, its current parameters and some other data can be found in the "proc" file system. The trailing number in the path is the scsi_debug host number which is the first element in the 4 item tuple shown in the lsscsi output above:
# cat /proc/scsi/scsi_debug/3
scsi_debug adapter driver, version 0189 [20200225]
num_tgts=1, shared (ram) size=1024 MB, opts=0x0, every_nth=0
delay=-9999, ndelay=100000, max_luns=10, sector_size=512 bytes
cylinders=130, heads=255, sectors=63, command aborts=0
RESETs: device=0, target=0, bus=0, host=0
dix_reads=0, dix_writes=0, dif_errors=0
usec_in_jiffy=1000, statistics=0
cmnd_count=0, completions=0, miss_cpus=0, a_tsf=0
submit_queues=1
  queue 0:
Here is an important sysfs directory for the scsi_debug driver:
# cd /sys/bus/pseudo/drivers/scsi_debug/
# ls -x
adapter0     add_host     ato          bind         cdb_len      delay
dev_size_mb  dif          dix          doublestore  dsense       every_nth
fake_rw      guard        host_lock    map          max_luns     max_queue
ndelay       no_lun_0     no_uld       num_parts    num_tgts     opts
ptype        random       removable    scsi_level   sector_size  statistics
strict       submit_queues  uevent     unbind       uuid_ctl     virtual_gb
vpd_use_hostno  zbc
Those files correspond to most of the scsi_debug parameters. Those that are writable can be modified and the scsi_debug actions will change accordingly thereafter. Certain parameters cannot be changed while the driver is busy (e.g. it has queued command responses), in which case EBUSY is returned if the user attempts to change one. Reading one can be done with the cat command and changing one can be done with the echo command:
# cat every_nth
0
# echo 2000 > every_nth
Another important sysfs directory for (any) disks is /sys/block/<disk_node_name> and its queue sub-directory. So in the case of this scsi_debug pseudo disk that directory would be /sys/block/sdc/queue . Also there is the scsi_device sysfs directory that has the form /sys/class/scsi_device/<h:c:t:l>/device where the <h:c:t:l> tuple is found at the left hand side of each device listed by lsscsi. This sysfs directory contains many important SCSI device parameters some of which can be modified.
Individual devices can be removed via sysfs and the mid-level by writing any value into the "delete" member in the sysfs directory corresponding to the scsi device. Given these devices:
# lsscsi -s
[0:0:0:0]    disk    SEAGATE  ST200FM0073      0A04  /dev/sda   200GB
[4:0:0:0]    disk    ATA      ST3160812AS      D     /dev/sdb   160GB
[7:0:0:0]    disk    Linux    scsi_debug       0184  /dev/sdc   21.4GB
then the scsi_debug (pseudo) disk can be deleted like this:
# echo 1 > /sys/class/scsi_device/7:0:0:0/device/delete
After which this should be seen:
# lsscsi -s
[0:0:0:0]    disk    SEAGATE  ST200FM0073      0A04  /dev/sda   200GB
[4:0:0:0]    disk    ATA      ST3160812AS      D     /dev/sdb   160GB
This will work for any scsi device (not just those belonging to scsi_debug). That scsi device can be re-added with the following command:
# echo "0 0 0" > /sys/class/scsi_host/host7/scan
# lsscsi
[0:0:0:0]    disk    SEAGATE  ST200FM0073      0A04  /dev/sda
[4:0:0:0]    disk    ATA      ST3160812AS      D     /dev/sdb
[7:0:0:0]    disk    Linux    scsi_debug       0184  /dev/sdc
The three numbers in the "echo" are channel number, target number and lun, respectively. Wildcards (hyphen: "-") can be given for any or all of the three numbers.
# echo 3 > /sys/bus/pseudo/drivers/scsi_debug/max_luns
# echo 2 > /sys/bus/pseudo/drivers/scsi_debug/num_tgts
# echo "0 - -" > /sys/class/scsi_host/host7/scan
# lsscsi
[0:0:0:0]    disk    SEAGATE  ST200FM0073      0A04  /dev/sda
[4:0:0:0]    disk    ATA      ST3160812AS      D     /dev/sdb
[7:0:0:0]    disk    Linux    scsi_debug       0184  /dev/sdc
[7:0:0:1]    disk    Linux    scsi_debug       0184  /dev/sdd
[7:0:0:2]    disk    Linux    scsi_debug       0184  /dev/sde
[7:0:1:0]    disk    Linux    scsi_debug       0184  /dev/sdf
[7:0:1:1]    disk    Linux    scsi_debug       0184  /dev/sdg
[7:0:1:2]    disk    Linux    scsi_debug       0184  /dev/sdh
The 'echo "0 - -" > scan' line above added five devices: /dev/sdd to /dev/sdh .
Extra hosts can be added and removed from the scsi_debug driver as follows:
# cd /sys/bus/pseudo/drivers/scsi_debug
# echo 1 > add_host     # add a new host (after the existing hosts)
# echo -2 > add_host    # remove the last two hosts (if at least that many are present)
The scsi_debug driver does not place any limit on the number of scsi devices it can create. By default when loaded it has one scsi device (owned by a host). Larger numbers of devices can be introduced at load time by specifying the add_host, num_tgts and/or max_luns parameters; the number of scsi devices created is the product of the 3 parameters (they all default to 1). Alternatively sysfs can be used to add (or remove) scsi devices after the scsi_debug driver is loaded. Two strategies can be used:
increase the value of num_tgts or max_luns then use a line like 'echo "0 - -" > scan' (shown above) to a host already owned by the scsi_debug driver.
add more hosts with a line like 'echo 3 > add_host'. Each new host will create (num_tgts * max_luns) new scsi devices. Of course num_tgts or max_luns can be modified prior to calling 'echo 3 > add_host'.
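The device count and the resulting <h:c:t:l> tuples can be worked out ahead of time. The sketch below assumes channel 0 and that lun 0 is present (i.e. no_lun_0 is not set). The function names are hypothetical.

```python
from itertools import product


def device_count(add_host=1, num_tgts=1, max_luns=1):
    """Total scsi devices created: the product of the three parameters."""
    return add_host * num_tgts * max_luns


def devices_on_host(host, num_tgts, max_luns, channel=0):
    """The [h:c:t:l] tuples one scsi_debug host is expected to expose."""
    return [
        f"[{host}:{channel}:{target}:{lun}]"
        for target, lun in product(range(num_tgts), range(max_luns))
    ]


print(device_count(add_host=2, num_tgts=2, max_luns=3))
for name in devices_on_host(7, num_tgts=2, max_luns=3):
    print(name)
```

For host 7 with num_tgts=2 and max_luns=3 the loop lists the same six tuples as the lsscsi output shown earlier, from [7:0:0:0] to [7:0:1:2].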
Even though the scsi_debug can create ten thousand or more devices, it doesn't mean that the scsi mid-level, sd, sg, the block layer and various other kernel components will handle it gracefully.
The supported mode pages are listed following the MODE SENSE entry in the supported commands section above. Prior to version 1.80, when a mode page is read no block descriptor is included in the response. From version 1.78 the MODE SELECT command is supported. Three mode pages can be modified:
caching (WCE field is changeable) [added in version 1.84]
control (D_SENSE field is acted upon)
informational exceptions control (MRIE and TEST fields are acted upon by REQUEST SENSE)
Saved pages are not supported, reflecting that the scsi_debug driver has only volatile storage. All fields can be changed, but only those fields indicated above have side effects.
Various users have asked for each scsi_debug device (i.e. "Logical Unit" (LU) in SCSI parlance) to have its own ram or backing store rather than all devices sharing the same ram. The answer has been: "look at tcm_loop" because this driver has been built to simulate thousands of hosts, targets and devices (LUs) without consuming the sort of resources that would usually imply.
That said, having only one ram image is a bit limiting. Having two ram images, shared between all the scsi_debug LUs (where the number of LUs is assumed to be greater than one), allows the correctness of copies to be checked. The SCSI VERIFY command simulation in version 1.89 of this driver has been "beefed up" from doing nothing and always returning a GOOD status, to doing a proper comparison of the data-out buffer against the ram disk when the BYTCHK field is set to 1. If that comparison fails, it stops and returns a CHECK CONDITION status, with a sense key of MISCOMPARE.
The following first loads the scsi_debug module with 4 LUs that share two slabs of 1 GiB, so /dev/sda and /dev/sdc share one, and /dev/sdb and /dev/sdd share the other. The logical block size defaults to 512 bytes and the ndelay=100000 random=1 means that the delay on each media command is uniformly distributed between 0 microseconds and a maximum of 100 microseconds. dd is not too smart trying to write random data off the end of /dev/sdc but it does the job. Then there is a copy of the data from /dev/sg2 (aka /dev/sdc) to /dev/sg3 . Finally the --verify on the sg_dd utility (found in version 1.45 and later of the sg3_utils package) compares /dev/sg2 and /dev/sg3 using the SCSI VERIFY(BYTCHK=1) command.
# modprobe scsi_debug max_luns=4 dev_size_mb=1024 ndelay=100000 random=1 doublestore=1
# lsscsi -gs
[0:0:0:0]    disk    Linux    scsi_debug       0189  /dev/sda   /dev/sg0   1.07GB
[0:0:0:1]    disk    Linux    scsi_debug       0189  /dev/sdb   /dev/sg1   1.07GB
[0:0:0:2]    disk    Linux    scsi_debug       0189  /dev/sdc   /dev/sg2   1.07GB
[0:0:0:3]    disk    Linux    scsi_debug       0189  /dev/sdd   /dev/sg3   1.07GB
[N:0:1:1]    disk    INTEL SSDPEKKF256G7L__1         /dev/nvme0n1  -       256GB
# dd if=/dev/urandom of=/dev/sdc
dd: writing to '/dev/sdc': No space left on device
2097153+0 records in
2097152+0 records out
1073741824 bytes (1.1 GB, 1.0 GiB) copied, 49.4036 s, 21.7 MB/s
# sg_dd if=/dev/sg2 of=/dev/sg3 bs=512
2097152+0 records in
2097152+0 records out
# sg_dd --verify if=/dev/sg2 of=/dev/sg3 bs=512
2097152+0 records in
2097152+0 records verified
There are two techniques for discovering the luns that a SCSI target supports. The first (and oldest) is based on sending commands like INQUIRY and REPORT LUNS to lun 0, even if the target has no lun 0. The second technique is based on one of the so-called "well known logical units", specifically the REPORT LUNS well known logical unit. If present it must support the INQUIRY, REPORT LUNS, REQUEST SENSE and TEST UNIT READY commands. Simulating one with scsi_debug is somewhat contorted:
# modprobe scsi_debug no_lun_0=1 max_luns=2
#
# lsscsi -g
[0:0:0:0]    disk    ATA      INTEL SSDSC2BW18 DC32  /dev/sda   /dev/sg0
[3:0:0:1]    disk    Linux    scsi_debug       0184  /dev/sdb   /dev/sg1
#
# lsscsi --hosts
[0]    ahci
[1]    ahci
[2]    ahci
[3]    scsi_debug
#
# ## Pick the host number corresponding to scsi_debug (i.e. "3")
#
# cd /sys/class/scsi_host/host3
# echo "- - 49409" > scan
#
# lsscsi -g
[0:0:0:0]    disk    ATA      INTEL SSDSC2BW18 DC32  /dev/sda   /dev/sg0
[3:0:0:1]    disk    Linux    scsi_debug       0184  /dev/sdb   /dev/sg1
[3:0:0:49409]wlun    Linux    scsi_debug       0184  -          /dev/sg2
The scsi_debug driver needed to be told that it had no_lun_0 so it started generating luns at 1 ([3:0:0:1]) and then the scsi sub-system needed to be told to scan specifically for lun 49409 (0xc101). Thereafter the REPORT LUNS wlun appeared.
The way a SCSI initiator (host) scans for targets is transport specific. In the case of the scsi_debug driver it has a magic transport (bus) called "pseudo" which does the right thing. Apart from target discovery, the scsi_debug driver tries to simulate SAS devices, see the next section.
The scsi_debug driver has a Serial Attached SCSI (SAS)
personality. For any application that cares, it looks like a dual
ported SAS disk accessed via the primary port (relative target port
1). In one case it masquerades as a SATA disk behind a SCSI to ATA
Translation (SAT) layer (SATL). Many of the settings are in common
with Fibre Channel dual ported disks.
The driver sets the
MULTIP (multiport) bit in the INQUIRY response. The following VPD
pages are SAS or SAT specific:
device identification page [0x83] (yields naa-5 addresses for the lu, the accessing target port and the target device, plus some other designators)
SCSI ports [0x88] (shows the naa-5 addresses of both ports)
ATA information [0x89] (simulates a SATA disk in a SAS domain, defined in SAT)
The naa-5 addresses are meant to be world wide unique names, which
represents a challenge to the scsi_debug driver. Amongst other things
Linux does not have an IEEE company id [memo: OSDL]. Even if it did,
making them truly unique in a virtual driver, especially if multiple
boxes could somehow see each other, would be difficult.
There
are also several SAS specific mode pages:
protocol specific port page (SAS): short format page [0x19,0x0]
protocol specific port page (SAS): phy control and discover subpage [0x19,0x1]
protocol specific port page (SAS): shared mode subpage [0x19,0x2] (sas2 version)
Both the VPD and mode pages can be viewed from the user space with an application like sdparm . Below is an example of the device identification VPD page:
# sdparm -i /dev/sda
    /dev/sda: Linux     scsi_debug        0004
Device identification VPD page:
  Addressed logical unit:
    desig_type: T10 vendor identification,  code_set: ASCII
      vendor id: Linux
      vendor specific: scsi_debug      2000
    desig_type: NAA,  code_set: Binary
      0x53333330000007d0
  Target port:
    desig_type: Relative target port,  code_set: Binary
     transport: Serial Attached SCSI (SAS)
      Relative target port: 0x1
    desig_type: NAA,  code_set: Binary
     transport: Serial Attached SCSI (SAS)
      0x52222220000007ce
  Target device that contains addressed lu:
    desig_type: NAA,  code_set: Binary
     transport: Serial Attached SCSI (SAS)
      0x52222220000007cd
    desig_type: SCSI name string,  code_set: UTF-8
     transport: Serial Attached SCSI (SAS)
      SCSI name string:
      naa.52222220000007CD
Below is an example of the SCSI ports VPD page showing a dual ported target:
# sdparm -i -p sp /dev/sda
    /dev/sda: Linux     scsi_debug        0004
SCSI Ports VPD page:
  Relative port=1
    Target port descriptor(s):
      desig_type: NAA,  code_set: Binary
       transport: Serial Attached SCSI (SAS)
        0x52222220000007ce
  Relative port=2
    Target port descriptor(s):
      desig_type: NAA,  code_set: Binary
       transport: Serial Attached SCSI (SAS)
        0x52222220000007cf
Notice that the above implies that the INQUIRY was sent via port 1 (port A) of the emulated SAS dual ported target. The protocol specific port phy control and discover mode subpage [0x19,0x1] has target port/phy SAS addresses that correspond to the SCSI ports VPD page:
# sdparm -t sas -p pcd -l /dev/sda
    /dev/sda: Linux     scsi_debug        0004
    Direct access device specific parameters: WP=0  DPOFUA=0
port: phy control and discover (SAS) mode page:
  PPID_1      6  [cha: n, def:  6]  Port's (transport) protocol identifier
  NOP         2  [cha: n, def:  2]  Number of phys
  PHID        0  [cha: n, def:  0]  Phy identifier
  ADT         1  [cha: n, def:  1]  Attached device type
  NPLR        9  [cha: n, def:  9]  Negotiated physical link rate
  ASIP        1  [cha: n, def:  1]  Attached SSP initiator port
  ATIP        0  [cha: n, def:  0]  Attached STP initiator port
  AMIP        0  [cha: n, def:  0]  Attached SMP initiator port
  ASTP        0  [cha: n, def:  0]  Attached SSP target port
  ATTP        0  [cha: n, def:  0]  Attached STP target port
  AMTP        0  [cha: n, def:  0]  Attached SMP target port
  SASA        0x52222220000007ce  [cha: n, def:0x52222220000007ce]  SAS address
  ASASA       0x5111111000000001  [cha: n, def:0x5111111000000001]  Attached SAS address
  APHID       2  [cha: n, def:  2]  Attached phy identifier
  PMILR       8  [cha: n, def:  8]  Programmed minimum link rate
  HMILR       8  [cha: n, def:  8]  Hardware minimum link rate
  PMALR       9  [cha: n, def:  9]  Programmed maximum link rate
  HMALR       9  [cha: n, def:  9]  Hardware maximum link rate
  2_PHID      1  [cha: n, def:  1]  Phy identifier
  2_ADT       1  [cha: n, def:  1]  Attached device type
  2_NPLR      9  [cha: n, def:  9]  Negotiated physical link rate
  2_ASIP      1  [cha: n, def:  1]  Attached SSP initiator port
  2_ATIP      0  [cha: n, def:  0]  Attached STP initiator port
  2_AMIP      0  [cha: n, def:  0]  Attached SMP initiator port
  2_ASTP      0  [cha: n, def:  0]  Attached SSP target port
  2_ATTP      0  [cha: n, def:  0]  Attached STP target port
  2_AMTP      0  [cha: n, def:  0]  Attached SMP target port
  2_SASA      0x52222220000007cf  [cha: n, def:0x52222220000007cf]  SAS address
  2_ASASA     0x5111111000000001  [cha: n, def:0x5111111000000001]  Attached SAS address
  2_APHID     3  [cha: n, def:  3]  Attached phy identifier
  2_PMILR     8  [cha: n, def:  8]  Programmed minimum link rate
  2_HMILR     8  [cha: n, def:  8]  Hardware minimum link rate
  2_PMALR     9  [cha: n, def:  9]  Programmed maximum link rate
  2_HMALR     9  [cha: n, def:  9]  Hardware maximum link rate
Other supported mode pages can be accessed in a similar way by the sdparm utility. Note that transport specific mode pages need the transport identified: hence the '-t sas' option above.
There is nothing to download, see <linux_kernel_source>/drivers/scsi/scsi_debug.c .
Hopefully the design of the scsi_debug driver lends itself to many
extensions. If you think that you have a useful extension that others
may be interested in, please contact the linux-scsi list or the
author with a patch.
Douglas Gilbert <dgilbert at interlog dot com>
with
additions from
Martin K. Petersen <martin dot petersen at
oracle dot com>
Last updated: 25th March 2020