The ddpt utility
The ddpt utility is a variant of the standard Unix command
dd which copies files. The ddpt utility specializes in
files that are block devices. For block devices that understand the
SCSI command set, finer grained control over the copy may be available
via a SCSI pass-through interface. ddpt
has been developed for Linux and ported to FreeBSD, Solaris and
Windows.
The types of block devices that are supported are
disks (known as direct access devices in SCSI) and cd/dvd/bd devices.
It is becoming more common for ATA disks (especially SATA) to be
accessed by an operating system using SCSI commands. ATA disks are
not always directly connected and transports such as USB, IEEE1394
(FireWire) and iSCSI use SCSI commands. Protocol translation from
SCSI to ATA ("SAT", first standardized in 2007) has
appeared in OSes and external devices (e.g. recent USB disk
enclosures) and many implementations are mature enough for ddpt to
use. Data can also be copied to and from NVMe disks.
The
ddpt utility is a more generic version of the Linux specific
sg_dd and sg_xcopy utilities found in the
sg3_utils package. Tape drives are only
supported in Linux; and only via the mtio interface associated with
st devices.
The ddpt utility supports two types of
offloaded copy. They are referred to as "xcopy" and "odx".
See the ddpt_xcopy_odx page for
more information.
This page outlines the features of the
ddpt utility version 0.97. The ddpt utility is
found in the package of the same name.
Some features found in ddpt which are not present in the (GNU) dd implementation:
odx offloaded copy (SCSI EXTENDED COPY(LID4) subset)
xcopy (SCSI EXTENDED COPY(LID1)) for offloaded copies via the pt interface
support for SCSI protection information (DIF)
introduce bpt=BPT (blocks per transfer) so bs=BS is kept as logical block size
generalize skip=SKIP and seek=SEEK to accept scatter gather lists (sgls)
bandwidth limiting via delays between each copy segment
pre-allocate output file prior to copying to it; to reduce fragmentation
write sparing (i.e. don't write buffer if already same as destination)
resume (after the copy has been interrupted)
trim on output of copy, self trim (pt interface only)
send output to a second file (see 'of2=' option)
when the --verify option is given IFILE is read and that data is sent (but not written) to OFILE in a SCSI VERIFY(BYTCHK=1) command
the --prefetch option may accompany --verify: a SCSI PRE-FETCH(IMMED) on OFILE precedes the SCSI READ on IFILE
put options in a job file, to save retyping on each invocation
access devices directly via pass-through (pt) interface, bypassing the kernel block layer
NVMe disks can be accessed either as block devices or, via the pass-through (pt) interface, through a SNTL (SCSI to NVMe Translation Layer) (Linux only)
accept numeric command line arguments in hexadecimal
explicit controls over how much data is read into the copy buffer and then written to output (separate from the logical block sizes of any device involved)
Sparse writing is an important ddpt feature that is now found in
recent versions of the GNU dd implementation. Several dd defaults
have been changed, usually in an effort to ease ddpt copying, or
simply reading, large amounts of data. Also by default dd attempts to
truncate its output file prior to the copy while ddpt defaults to
overwriting the output file.
For example with dd if the
'of=' option is not given, large amounts of data (often binary) will
be sent to the console (stdout) making it difficult for the
inexperienced user to understand what has happened. With ddpt if the
'of=' option is not given then nothing is output (equivalent to
outputting it to /dev/null) effectively
making the invocation a read rather than a copy.
The basic syntax of the ddpt utility is the same as the dd command
in Unix. That said, the syntax of the dd command in Unix is different
from almost all other Unix commands. Those familiar with the dd
command should not be too surprised by the syntax and semantics of
this utility. For those unfamiliar, special care should be taken,
especially with the 'of=' and 'seek=' options, both with dd and ddpt.
Wikipedia has an informative page with examples on the Unix
dd command.
The distinction is minor, but this document uses the term 'operand'
to refer to the <name>=<argument> construct and the term 'option' to
refer to command line elements starting with '-' or '--'.
There are multiple definitions and implementations of dd. The simplest current definition is from POSIX.2008 (aka SUSv4). The GNU version of dd is probably the most widely used and adds the 'iflag=' and 'oflag=' options. FreeBSD has its own implementation which does not have 'iflag=' and 'oflag=' options but adds 'conv=sparse'. The recent GNU implementation is used as a reference point. The fundamental options of dd are:
Basic dd operands and option | dd default | ddpt default | Brief description
bs=BS | IBS, OBS or 512 | IBS, OBS or 512 | Number of bytes in each input and output block. Sets IBS and OBS.
count=COUNT | blocks in IFILE | blocks in IFILE | Number of input blocks to copy.
if=IFILE | stdin | [none] | file (or device) to read from.
of=OFILE | stdout | /dev/null | file (or device) to write to.
Table 1 Fundamental dd options
When either dd or ddpt is given these options with suitable
arguments, they will copy (IBS * COUNT) bytes from the
beginning of IFILE to the beginning of OFILE. Note the
different defaults for 'if=' and 'of=' between dd and ddpt; while
defaulting to stdin and stdout may be more in keeping with a Unix
filter type command, in practice the filter syntax is not used much
for ddpt. The author feels that having no default for 'if=' and
/dev/null (or the Windows equivalent: NUL) as the default for
'of=' is more useful and safer.
ddpt differs from dd as
follows. An IFILE of "-" is interpreted as stdin; an
OFILE of "-" is interpreted as stdout while an OFILE
of "." is interpreted as /dev/null.
[dd interprets input and output file names of "-"
literally; dd interprets an output file of "." as the
current directory and will not accept it.] By default the ddpt
utility does not truncate the OFILE before starting the copy
(the dd command does if it is a regular file). ddpt has an
'oflag=trunc' option (or 'conv=trunc' option) that will truncate the
OFILE before starting the copy. For output block devices
(including those accessed via the pt interface) ddpt writes integral
multiples of OBS bytes to the OFILE so it does not do
partial writes, ignoring them in the case of the last copy segment.
For regular output files (including fifos and stdout) ddpt can do
partial writes (e.g. the last write is not a multiple of OBS
bytes) to OFILE. Note that since regular OFILEs are not
truncated by default the length of OFILE may end up larger
than the length of IFILE.
If the 'count=' option is
not given then an attempt is made to determine the remaining blocks
in the input file, device or partition, and likewise for the output.
If both counts are found then the minimum of the two is used. If the
input file is stdin and no count is given then the copy continues
until an EOF is detected on the input stream (or something else goes
wrong). The 'skip=' option for IFILE and the 'seek=' option for
OFILE are taken into account when calculating the remaining
number of blocks in a file, device or partition.
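The rule above can be sketched in a few lines of Python. This is only an illustration, not ddpt's code: the function name is hypothetical, sizes are assumed to be already expressed in blocks of one common size, and using whichever single count is known when the other is not is an assumption.

```python
from typing import Optional


def blocks_to_copy(in_blocks: Optional[int], out_blocks: Optional[int],
                   skip: int = 0, seek: int = 0) -> Optional[int]:
    """Pick the block count to use when no count= is given.

    in_blocks/out_blocks are total sizes in blocks, or None when a size
    cannot be determined (e.g. stdin). A result of None stands for
    "copy until EOF".
    """
    # skip= and seek= reduce what remains on each side
    in_left = None if in_blocks is None else max(in_blocks - skip, 0)
    out_left = None if out_blocks is None else max(out_blocks - seek, 0)
    if in_left is not None and out_left is not None:
        return min(in_left, out_left)
    # Assumption: fall back to whichever side is known
    return in_left if in_left is not None else out_left
```

For example, a 1000 block input with skip=100 copied onto an 800 block output gives a count of 800.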
If the 'count=' option is given then
no further checks regarding the remaining length of IFILE and
OFILE are done and ddpt will attempt to copy that number
of blocks. The 'count=0' option is valid and all the normal
preparations are made including opening files but no copy takes
place. Hence the 'count=0' option can be used to check that the
syntax is in order and that the files are present (see the "Verbose"
section below).
Other dd options also supported by ddpt:
dd operands | Brief description
cbs=CBS | ddpt accepts but ignores this dd option ("conversion block size")
conv=CONV | see section on Conversions below
ibs=IBS | number of bytes in each block of IFILE (default: 512)
iflag=FLAGS | similar to option found in recent GNU dd versions, see below
obs=OBS | number of bytes in each block of OFILE (default: 512)
oflag=FLAGS | similar to option found in recent GNU dd versions, see below
seek=SEEK | block number (LBA) in OFILE to commence writing (default: 0). In ddpt (but not dd) it may also be a scatter gather list made up of starting_LBA,number_of_blocks pairs (default: 0,0).
skip=SKIP | block number (LBA) in IFILE to commence reading (default: 0). In ddpt (but not dd) it may also be a scatter gather list made up of starting_LBA,number_of_blocks pairs (default: 0,0).
status=STAT | accepts 'noxfer' to suppress timing and throughput information or 'none' to suppress all trailing reports (apart from errors). From version 0.96 it also accepts 'progress' for progress reports every 2 minutes. 'progress,progress' generates a progress report every minute. When used 3 times that is shortened to 30 seconds.
dd options |
--help -h | print usage message then exit. '-h' option is equivalent.
--version -V | print version number and release date then exit. '-V' option is equivalent.
Table 2 Other dd options also supported by ddpt
If the 'bs=BS' option is given then both IBS
and OBS are set to BS. If the 'bs=BS' option is
given then the presence of either 'ibs=IBS' or 'obs=OBS'
option is a syntax error. If both 'ibs=IBS' and 'obs=OBS'
are given and differ then (IBS * BPT) must be divisible
by OBS, without any remainder. [BPT is input "blocks
per transfer" and is explained below.] For example, if a
disk with 512 byte blocks (hence 'ibs=512') is being copied to
another disk with 4096 byte blocks (hence 'obs=4096') then the BPT
value should be 8 (or a multiple of 8). So in this case the BPT
default of 128 is acceptable.
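The divisibility rule can be checked with a small Python sketch. The function names are hypothetical and this is not ddpt's own code; it only restates the arithmetic given above.

```python
from math import gcd


def transfer_fits(ibs: int, obs: int, bpt: int) -> bool:
    """True when (IBS * BPT) is divisible by OBS without remainder."""
    return (ibs * bpt) % obs == 0


def smallest_bpt(ibs: int, obs: int) -> int:
    """Smallest BPT that satisfies transfer_fits(); valid BPTs are its multiples."""
    return obs // gcd(ibs, obs)
```

With 'ibs=512' and 'obs=4096' the smallest usable BPT is 8, and the default of 128 passes the check because it is a multiple of 8.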
Modern storage is typically addressed in terms of a Logical Block
Address (LBA) which starts at 0 for the first logical block and
finishes with a LBA that is one less than the device size (measured
in logical blocks, typically 512 or 4096 bytes long each). Prior to
LBAs various schemes such as "Cylinder-head-sector" were
used (see Wikipedia) that reflected the physical architecture of
"hard" disks at the time.
The 'skip=SKIP'
option, for SKIP greater than 0, requires IFILE to be
seek-able or at least not give an error when the file pointer is
moved (e.g. using the lseek() system call on /dev/zero
doesn't cause an error in Unix). The 'seek=SEEK' option, for
SEEK greater than 0, requires OFILE to be seek-able or
at least not give an error when the file pointer is moved (e.g. using
the lseek() system call on /dev/null
doesn't cause an error in Unix). ddpt does not do dummy reads, as dd
does, if an attempt to move a file pointer fails.
All
numeric arguments can take a multiplier suffix. These multiplier
suffixes are the same as those of GNU's dd (posted 2001-12-18):
Multiplier | Meaning (multiply associated number by)
x<n> | <n> [e.g. '2x512' yields 1024]
c | 1
w | 2
b | 512
k K KiB | 1024
KB | 1000
m M MiB | 1048576
MB | 1000000
g G GiB | 2**30
GB | 10**9
t T TiB | 2**40
TB | 10**12
Table 3 Multiplier suffixes for numeric arguments
The pattern that starts with "k" and proceeds to
"m", "g" and "t" continues with "p",
"e", "z" and "y" (not shown in the
above table). ddpt only implements as far as "p" (10**15 or
2**50). ddpt only allows multipliers based on "t" and "p"
for COUNT, SKIP and SEEK.
ddpt allows
numeric arguments to be given in hexadecimal in which case they can
be prefixed by either "0x" or "0X". A numeric
argument cannot both be in hex and have a suffix multiplier. Hence
"0x9" is interpreted as hexadecimal 9 [not (0 * 9)==0].
This string is valid: "2x4x0xa" and yields 80 (but it isn't
very clear).
Hexadecimal numbers can also be indicated by
a trailing "h" or "H". The "h" suffix
cannot be used together with a suffix multiplier.
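The numeric argument rules above can be illustrated with a short Python sketch. It is an approximation written for this page, not ddpt's parser: the names are hypothetical, the "p" entries are filled in by following the stated pattern, the restriction of "t" and "p" to COUNT, SKIP and SEEK is not modelled, and letting a trailing "h" number appear inside an 'x' product is an assumption.

```python
import re

# Multiplier suffixes from Table 3; the "p" entries extend the same pattern.
SUFFIXES = {
    "": 1, "c": 1, "w": 2, "b": 512,
    "k": 2**10, "K": 2**10, "KiB": 2**10, "KB": 10**3,
    "m": 2**20, "M": 2**20, "MiB": 2**20, "MB": 10**6,
    "g": 2**30, "G": 2**30, "GiB": 2**30, "GB": 10**9,
    "t": 2**40, "T": 2**40, "TiB": 2**40, "TB": 10**12,
    "p": 2**50, "P": 2**50, "PiB": 2**50, "PB": 10**15,
}


def parse_num(text: str) -> int:
    """Parse a dd style number: decimal with suffix, hex, or an 'x' product."""
    # A leading 0x/0X means hexadecimal, so "0x9" is 9 and not 0 * 9
    if text[:2].lower() == "0x":
        return int(text[2:], 16)
    # Otherwise the first 'x' separates a number from its multiplier <n>
    head, sep, tail = text.partition("x")
    if head[-1:].lower() == "h":
        value = int(head[:-1], 16)          # trailing h/H: hex, no suffix
    else:
        m = re.fullmatch(r"(\d+)([A-Za-z]*)", head)
        if m is None or m.group(2) not in SUFFIXES:
            raise ValueError(f"bad numeric argument: {text!r}")
        value = int(m.group(1)) * SUFFIXES[m.group(2)]
    return value * parse_num(tail) if sep else value
```

Under these assumptions parse_num("2x512") gives 1024 and parse_num("2x4x0xa") gives 80, matching the examples in the text.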
If a
SIGUSR1 signal is sent to the process identifier (pid) of a running
ddpt utility then the number of blocks copied to that point is
output. The copy continues.
Unless the 'status=noxfer'
option is given, the elapsed time for the copy plus the throughput
measured in megabytes (10**6 bytes) per second is output when
the copy is complete (or an error stops the copy). If a SIGUSR1
signal is sent to the process identifier (pid) of a running ddpt
utility then the elapsed time and the throughput of the copy to that
point is output and the copy continues.
The extra options of ddpt (not found in GNU's dd) are:
extra operands in ddpt | default | Brief description
bpt=BPT | varies | Blocks Per Transfer (BPT) is the number of input blocks per transfer (granularity of each IO) read into the copy buffer. Default varies between 8192 and 1 depending on IBS. If BPT is given as zero, it is changed to the default value. See below this table.
bpt=BPT,OBPC | OBPC=0 | Output Blocks Per Check (OBPC) controls the granularity of sparse write, write sparing and trim checks. Default (0) is equivalent to OBPC=(BPT*IBS)/OBS. If the given OBPC exceeds (BPT*IBS)/OBS then it is scaled back to that value.
cdbsz=6|10|12|16|32 | 10 or 16 | cdb size of SCSI READ and/or WRITE commands. Only applicable to pt devices. Defaults to a 10 byte cdb unless the largest address exceeds 32 bits or BPT exceeds 16 bits. In either case a 16 byte cdb is used. Two values can be given, separated by a comma; if so the first value is for IFILE, the second value is for OFILE.
cdl=CDL | 0 | command duration limits. Either one or two, comma separated, values where 0 means no command duration limits. Values 0 to 7 are permitted and map to 3 bit fields in the SCSI READ(16,32) and WRITE(16,32) commands. If one value is given, it applies to both IFILE and OFILE. If two values are given, the first applies to IFILE and the second applies to OFILE. Command duration limits can be accessed and changed via mode pages. See the sdparm utility.
coe=0|1 | 0 | when non-zero, continue_on_error. May use iflag=coe and/or oflag=coe instead. See section on continue on error.
coe_limit=CL | 0 | number of consecutive "bad" block errors allowed when reading and 'coe > 0'. Default of 0 is interpreted as no limit. See section on continue on error.
ddpt=VERS | | causes a syntax error if the version number of the ddpt executing the command line (or job file) is less than VERS. This operand was introduced in version '0.96'. If VERS starts with 'r' then the check is based on the subversion revision number; the current revision number of the ddpt package is 358.
delay=MS,W_MS | 0,0 | delay (sleep) after each copy segment (typically (BPT*IBS) bytes) by MS milliseconds. 0 implies no delay. Actual write operations may be delayed by W_MS milliseconds.
id_usage=LIU | 0 or 2 | xcopy: set list_id_usage to hold (0), discard (2) or disable (3)
intio=0|1 | 0 | allow read, write and pass-through calls to be interrupted by signals. Default is 0 which means that during those calls the SIGINT, SIGPIPE and SIGUSR1 (SIGINFO) signals are blocked.
iseek=SKIP | 0 | same as skip=SKIP. From FreeBSD's dd command
ito=ITO | 0 | odx: inactivity timeout in seconds (0 --> TPC VPD page's default)
list_id=LID | 1, 0 or 257 | xcopy: list_identifier, a value from 0 to 255. oflag=wstream: LID is used as the stream identifier (16 bit value, default: 0)
of2=OFILE2 | /dev/null | second output file. Cannot be pt device.
oseek=SEEK | 0 | same as seek=SEEK
prio=PRIO | 1 | xcopy: value for priority field
protect=RDP,WRP | 0,0 | Set the RDPROTECT field in SCSI READs, and the WRPROTECT field in SCSI WRITEs.
retries=RETR | 0 | number of times to retry an error on a pt device READ or WRITE command
rtf=RTF | | odx: ROD Token file, see ddpt_xcopy_odx
rtype=RTYPE | 0 | odx: ROD type, see ddpt_xcopy_odx
to=TO | 0 | odx,xcopy: command timeout in seconds (0 --> 600 seconds)
verbose=VERB | 0 | the larger VERB is, the greater the debug output. 1 and 2 print the cdbs for setup commands; 3 and 4 print the cdbs for all commands
extra ddpt options | |
--dry-run -d | | parse command line operands and options then prepare for the read/copy (e.g. by determining file and device sizes) but bypass the actual read/copy.
--job=JF | | JF is a job file containing options. '#' is treated as a comment lead-in
--odx -o | | odx: request ODX operation, see ddpt_xcopy_odx
--prefetch -P | | used in conjunction with --verify option. For each segment sends this sequence of SCSI commands: PRE-FETCH(OFILE, IMMED), READ(IFILE) and VERIFY(OFILE, BYTCHK=1)
--progress -p | |
--quiet -q | | suppress 'normal' dd like output, making ddpt more like a typical Unix utility in which "no news is good news"
--verbose -v | | equivalent to 'verbose=1'. If used twice equivalent to 'verbose=2'. May be shortened to '-v' or '-vv'.
--verify -X | | instead of copying IFILE to OFILE, this option causes IFILE and OFILE to be compared. The comparison continues until the count is exhausted or an inequality ("miscompare") is detected. Uses the SCSI VERIFY(BYTCHK=1) command rather than the more common READ(IFILE)+READ(OFILE)+compare approach.
--wscan -w | | Windows only. Lists storage devices and associated volumes then exits. Other options ignored.
--xcopy -x | | use EXTENDED_COPY rather than READ,WRITE to do the copy
ddpt command line arguments | |
JF | | JF is a job file and its name must not start with '-' or contain a '='. JF is checked to make sure it is regular and contains ASCII characters before being parsed.
Table 4 Extra options found in ddpt
The default values for BPT are: for IBS <
8, BPT is 8192; for IBS < 64, BPT is 1024;
for IBS < 1024, BPT is 128; for IBS <
8192, BPT is 16; for IBS < 32768, BPT is 4;
else BPT is 1.
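The default BPT thresholds, together with the OBPC default and scaling rule from Table 4, can be written out as a Python sketch. The function names are hypothetical, and the use of integer division for (BPT*IBS)/OBS is an assumption.

```python
def default_bpt(ibs: int) -> int:
    """Default blocks per transfer for a given input block size in bytes."""
    if ibs < 8:
        return 8192
    if ibs < 64:
        return 1024
    if ibs < 1024:
        return 128
    if ibs < 8192:
        return 16
    if ibs < 32768:
        return 4
    return 1


def effective_obpc(bpt: int, ibs: int, obs: int, obpc: int = 0) -> int:
    """Output blocks per check: 0 means the default, larger values are scaled back."""
    limit = (bpt * ibs) // obs      # assumption: integer division
    return limit if obpc == 0 or obpc > limit else obpc
```

So a 512 byte IBS gets a BPT of 128 (64 KiB per transfer) and a 4096 byte IBS gets a BPT of 16 (also 64 KiB per transfer).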
If OFILE2 is given then it is
written to prior to the write to OFILE including processing
such as sparse writing.
The FLAGS argument of 'iflag=' and 'oflag=' is a comma separated list of items chosen from one or more entries in this table:
FLAG | filetype | iflag or oflag | comments
00 | | iflag | replaces if=IFILE with as many bytes of 0x0 as are required. Same as giving if=/dev/zero in Unix.
append | reg | oflag | use O_APPEND open flag. Conflicts with 'seek=SEEK' when "SEEK > 0". Pointless on block device, may cause open error
atomic | pt | oflag | use WRITE ATOMIC(16) command in place of the usual WRITE command
block | pt | both | open of pt files (devices) typically defaults to non-blocking. This flag will make the open()s blocking
cat | blk, pt | both | xcopy: set cat flag in segment descriptor header
coe | all | iflag, both for pt | See section on continue on error.
dc | blk, pt | both | xcopy: set dc flag in segment descriptor header
direct | blk, reg | both | use O_DIRECT open flag. Bypass block layer's buffering.
dpo | pt | both | "disable page out" set for READ and/or WRITE SCSI commands
errblk | pt | iflag | writes LBAs of bad blocks (medium errors) to errblk.txt file. One LBA per line, in hex, preceded by 0x.
excl | all | both | Use O_EXCL open flag
fdatasync | blk, reg | oflag | flush OFILE's data to storage at end of copy. Ignored if oflag=direct also given.
ff | | iflag | replaces if=IFILE and supplies as many bytes of 0xff as are required
flock | all | both | use advisory exclusive lock
force | pt | both | override objections and warnings from sanity checking (e.g. discrepancy between IBS or OBS and the block size in the SCSI READ CAPACITY command response)
fsync | blk, reg | oflag | flush OFILE's data and metadata to storage at end of copy. Ignored if oflag=direct also given.
fua | pt | both | "force unit access" set for READ and/or WRITE SCSI commands
fua_nv | pt | both | "force unit access non-volatile cache" set for READ and/or WRITE SCSI commands
ignoreew | tape | oflag | ignore early warning (of end of tape).
nocache | blk, reg | both | Use posix_fadvise(POSIX_FADV_DONTNEED) to suggest minimal use of file buffers (kernel cache) associated with files being copied.
nocreate | all | oflag | The default action if OFILE does not exist is to create a regular file of that name. This can give unwanted results, for example 'of=/dev/sg7' if there is no device of that name will create a regular file; and that regular file will hide device /dev/sg7 if it does get connected. [The solution is: 'rm /dev/sg7'.] With 'oflag=nocreate' an error will occur if OFILE does not already exist, and no copy (or read) takes place.
no_del_tkn | pt | oflag | odx: see ddpt_xcopy_odx
nofm | tape | oflag | Suppress writing the filemark which is normally written by the st tape driver on closing the tape file
nopad | tape | oflag | when the block to be written to a tape drive contains less than OBS bytes, then this option causes the partial block to be written as is. The default action for a tape in this case is to pad the block.
norcap | pt | both | do not perform SCSI READ CAPACITY command
nowrite | all | oflag | bypass writes to OFILE. Other commands (e.g. related to trim) are sent to OFILE. The "records out" count is not incremented. See section on trim and unmap.
null | all | both | this flag is just a place holder
odx | pt | both | odx: request ODX operation, see ddpt_xcopy_odx
pad | all | oflag | when the block to be written (typically the last block) contains less than OBS bytes, then this option causes the block to be padded with zeros. The default for tapes is to pad; the default for other file types is nopad
prealloc | reg | oflag | use fallocate() to allocate space for OFILE prior to any data being written. This reduces fragmentation of OFILE.
prefer_rcs | pt | oflag | odx: prefer RECEIVE COPY STATUS command to default RECEIVE ROD TOKEN INFORMATION command
pt | blk | both | access block device via SCSI pass-through mechanism. Has no effect on pt device. For NVMe disks a SNTL is used to translate SCSI commands to the corresponding NVMe (or NVM) commands when this flag is used [Linux only currently].
rarc | pt | iflag | set field of that name in SCSI READ commands
resume | reg | oflag | if copy interrupted add 'resume' to oflag to restart copy
rtf_len | pt | both | odx: see ddpt_xcopy_odx
self | pt | both | specify self trim when used together with trim. Can appear as iflag or oflag but applies to OFILE which needs to be a pt device
sparing | all | oflag | don't write output buffers if reading the OFILE indicates the data compares equal. See section on write sparing.
sparse | all | oflag | don't write output buffers that are full of zeros. The last segment of a regular OFILE is written except when the sparse argument is given twice. See section on sparse writes.
ssync | pt | oflag | send SCSI SYNCHRONIZE CACHE command to OFILE after copy
strunc | reg | oflag | variant of oflag=sparse in which ftruncate() system call is used to extend OFILE if necessary
sync | all | both | use O_SYNC open flag, probably ignored on pt devices
trim | pt | oflag | similar functionality to sparse. Sends TRIM (UNMAP or WRITE SAME) command when zeros found. See section on trim and unmap.
trunc | reg | oflag | truncate the OFILE prior to starting the copy. If SEEK is not given or 0, truncate to zero length; else truncate to the length implied by SEEK. The default action of ddpt is to not truncate the OFILE (the opposite of what the dd command does).
unmap | pt | oflag | See trim
verify | pt | oflag | for pass-through output use the WRITE AND VERIFY command rather than WRITE. If ",bytchk" option given then set field of that name in command.
wstream | pt | oflag | for pass-through output use the WRITE STREAM command rather than WRITE
xcopy | blk, pt | both | if with iflag then EXTENDED COPY sent to IFILE. If with oflag then EXTENDED COPY sent to OFILE
Table 5 Arguments to ddpt's iflag and oflag options
Recent versions of GNU's dd command have these flags with
semantics similar to ddpt's: 'append', 'direct' and 'sync'.
The CONV argument of 'conv=' is a comma separated list of items chosen from one or more entries in this table:
CONV | filetype | comments
fdatasync | blk, reg | see fdatasync flag
fsync | blk, reg | see fsync flag
nocreate | all | equivalent to oflag=nocreate
no_del_tkn | pt | equivalent to oflag=no_del_tkn
noerror | all | IO error does not stop copy. dd's 'conv=noerror,sync' maps to ddpt's 'iflag=coe'. See the coe flag
notrunc | reg | do not truncate OFILE before copy starts (default action)
null | all | this conversion is just a place holder
prefer_rcs | pt | equivalent to oflag=prefer_rcs
resume | reg | see the resume flag
rtf_len | pt | equivalent to "oflag=rtf_len"
sparing | all | see the sparing flag
sparse | all | see the sparse flag
sync | all | this conversion is accepted and ignored.
trunc | reg | see the trunc flag
Table 6 Arguments to ddpt's conv option
The dd command in Unix has been around for a long time. In
the early days the 'conv=' option allowed things like ASCII to EBCDIC
conversions to take place as part of the copy process. ddpt does not
implement these "classical" conversions. More recently,
conversions have been added by some dd implementations (e.g.
FreeBSD's dd supports 'conv=sparse') that resemble some ddpt
features. So in some cases, conversions are accepted and mapped to
various flag arguments.
The standard skip=<starting_LBA> operand and the equivalent seek= operand of the dd command were generalized into scatter gather lists in version 0.96 of ddpt. For skip= a gather list is a sequence of [starting_LBA,number_of_blocks] pairs that will be "gathered up" on the remote storage device and sent back to the host (running the ddpt command) as a linear sequence. For seek= a scatter list is a sequence of [LBA,number_of_blocks] pairs onto which a linear sequence of bytes sent from the host will be "scattered" on the remote storage device. Since these two operations are closely related and reciprocal, the collective term "scatter gather list" (sgl) is used for both.
A little sleight of hand is used here to generalise skip=<starting_LBA> to skip=<sgl> as used by ddpt. A scatter gather list element essentially replaces both the skip=<starting_LBA> and count=<number_of_blocks> operands in the standard dd utility. So the problem is how to accept a standard dd invocation using both skip=<starting_LBA> and count=<number_of_blocks> operands when given to the ddpt utility so that the end result is the same. The solution chosen was to treat skip=<starting_LBA> as a special case and expand it to the sgl element [starting_LBA, 0]. Further, when the last sgl element has "0" as its number_of_blocks, that is interpreted as "to the end of the transfer" where that count is deduced some other way. Apart from this special case (i.e. the sleight of hand), when skip= and seek= operands are given to ddpt they must have an even number of elements following this pattern: LBA0,NUM0,LBA1,NUM1,LBA2,NUM2 ... etc. Here LBA is a shortening of starting logical block address and NUM is a shortening of number of blocks (from and including the starting LBA).
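The expansion of a single value and the even-number rule can be sketched in Python as follows. The function names are hypothetical, only the command line form is handled (not the @filename forms), and multiplier suffixes inside the list are not modelled.

```python
from typing import List, Tuple


def to_int(tok: str) -> int:
    """Decimal by default; hex with a leading 0x or a trailing h."""
    tok = tok.strip()
    if tok[:2].lower() == "0x":
        return int(tok[2:], 16)
    if tok[-1:].lower() == "h":
        return int(tok[:-1], 16)
    return int(tok)


def parse_sgl(arg: str) -> List[Tuple[int, int]]:
    """Turn 'LBA0,NUM0,LBA1,NUM1,...' into a list of (LBA, NUM) pairs."""
    nums = [to_int(tok) for tok in arg.split(",")]
    if len(nums) == 1:
        # Classic dd form: skip=<starting_LBA> becomes [starting_LBA, 0]
        return [(nums[0], 0)]
    if len(nums) % 2:
        raise ValueError("sgl needs an even number of values")
    return list(zip(nums[0::2], nums[1::2]))
```

Here parse_sgl("100") yields the single element (100, 0), while parse_sgl("0,8,0x20,4") yields (0, 8) followed by (32, 4).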
If a large sgl is to be input, placing it in a file may be more convenient; skip=@sgl_filename or seek=H@sgl_in_hex_filename can then be given. As in this utility and the others from sg3_utils that it shares a library with, all numbers given are in decimal unless otherwise indicated. Hex numbers can be indicated by a leading "0x" (the C/C++ language convention) or with a trailing "h" (as used in the t10.org generated standards). Scatter gather lists may have geometric properties (e.g. converting CHS addressed storage to the corresponding LBA addressed storage) which may lead to them being generated by a program, in which case a hex representation may be more convenient. sgl files that are implicitly hexadecimal can be loaded with the seek=H@<filename> syntax. As a sanity check, such a sgl file containing implicitly hex numbers must contain the word "hex" (or "HEX") before any sgl elements.
sgl elements that have 0 as their number_of_blocks are somewhat curious and are termed degenerate elements in ddpt. No data is copied to (or read from) the storage device based on a degenerate sgl element. In certain contexts they can be viewed as metadata. When categorizing sgls the treatment of degenerate elements is problematic. For example, should a degenerate sgl element affect the calculation of a sgl's lowest and highest LBAs? Also, should a degenerate sgl element affect the classification of a sgl as monotonic? Ascending monotonic means the current LBA is greater than or equal to the sum of the previous element's LBA and its NUM, for all sgl elements other than the first. A related definition can be used for descending monotonic scatter gather lists.
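The ascending monotonic definition translates directly into a Python sketch. The function name is hypothetical, and degenerate elements are simply treated like any other element here, which is only one possible answer to the questions raised above.

```python
from typing import List, Tuple


def is_ascending_monotonic(sgl: List[Tuple[int, int]]) -> bool:
    """Each LBA must be >= the previous element's LBA plus its NUM."""
    return all(cur_lba >= prev_lba + prev_num
               for (prev_lba, prev_num), (cur_lba, _) in zip(sgl, sgl[1:]))
```

A list such as [(0, 8), (8, 8), (32, 4)] passes, while [(0, 8), (4, 8)] fails because the second element starts inside the first.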
The following environment variables modify the behaviour of ddpt when they are defined in the shell that ddpt is invoked in:
DDPT_DEF_BS
ODX_RTF_LEN
XCOPY_TO_DST
XCOPY_TO_SRC
ddpt defaults its BS value to 512 bytes, which might become tiresome
when working with a newer 4096 byte block size storage environment.
The solution (in a bash shell) is to do this:
export DDPT_DEF_BS=4096
The other 3 environment variables
are xcopy/odx specific and are explained in the ddpt_xcopy_odx
page.
Job files are modelled on a similar facility in the fio
utility. Both ddpt and fio have a lot of command line options that
can become burdensome to re-enter if the utility is being executed
multiple times. So a job file is simply a file that contains options.
Options can be placed on separate lines and anything in a job file
starting with "#" to the end of that line is ignored; so
"#" can be used as a lead-in to comments. Blank lines in a
job file are ignored.
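A minimal Python sketch of the job file reading rules described above follows. The function name is hypothetical, and treating several whitespace separated options on one line as separate options is an assumption.

```python
from typing import List


def job_file_tokens(text: str) -> List[str]:
    """Return the options found in the text of a job file, in order."""
    tokens: List[str] = []
    for line in text.splitlines():
        line = line.split("#", 1)[0]    # drop a comment, if any
        tokens.extend(line.split())     # blank lines contribute nothing
    return tokens
```

Given a file holding "if=/dev/sda bs=512" on one line and "count=1 # one block" on another, this returns the three options and nothing else.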
In ddpt, a job file can be specified
as an argument to --job=JF or, more dangerously, directly on
the command line. The second variant is "more dangerous"
because it needs to distinguish itself from dd style options that
contain an equal sign (e.g. iflag=coe ) and common syntax errors. For
example:
ddpt if=/dev/zero iflag-coe bs=512 /dev/sde3 seek=20m count=1
will attempt to parse both iflag-coe as a job file (probably because
the user meant to type '=' rather than '-') and /dev/sde3 as a job
file (where the user probably meant of=/dev/sde3). Various safety
checks on potential job files will catch these cases (but not all
situations). Most likely a file called iflag-coe doesn't exist, and
if /dev/sde3 does exist, it is not a "regular" file. A further check
is done looking for non-ASCII characters which should catch pure data
or executable files that ddpt is trying to interpret as a job file.
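The safety checks just described can be approximated in Python as shown below. This is a sketch under assumptions, not ddpt's implementation: the function name is hypothetical and the whole file is read for the ASCII check, which may differ from what ddpt actually examines.

```python
import os
import stat


def looks_like_job_file(arg: str) -> bool:
    """Apply the name, file type and ASCII checks to a bare argument."""
    # The name must not start with '-' or contain '='
    if arg.startswith("-") or "=" in arg:
        return False
    try:
        st = os.stat(arg)
    except OSError:
        return False                    # e.g. a mistyped iflag-coe
    if not stat.S_ISREG(st.st_mode):
        return False                    # e.g. a block device
    with open(arg, "rb") as f:
        return f.read().isascii()       # rejects data and executables
```

With these checks a mistyped operand that names no file, or a device node, is rejected before any attempt to parse it as a job file.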
ddpt parses job files (there may be several) when it sees them in a left
to right scan of the command line options. Depending on the option,
earlier defined options may override, clash with, or accumulate with
the same option given in a job file (or later on the command line).
For example bpt= options override one another so the last one
encountered "wins"; on the other hand if=IFILE and of=OFILE
options clash, only one of each is permitted. And options like
--verbose (or -v) accumulate. Job files themselves are parsed line by
line, from the "top" to the end of the file.
Job
files can call other job files including themselves. The depth of the
call chain is tracked and when it reaches 5, the parsing stops with
an error. This should catch infinite recursion when a job file
invokes itself.
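The depth tracking can be illustrated with a Python sketch in which job files are held in a dictionary rather than read from disk. All names are hypothetical, and the exact point at which the limit of 5 is counted (here the command line is depth 0) is an assumption.

```python
from typing import Dict, List

MAX_DEPTH = 5   # from the text; how ddpt counts levels is assumed


def expand(tokens: List[str], job_files: Dict[str, str],
           depth: int = 0) -> List[str]:
    """Replace job file names by their options, following nested job files."""
    if depth >= MAX_DEPTH:
        raise ValueError("job files nested too deeply")
    out: List[str] = []
    for tok in tokens:
        if tok in job_files:
            out.extend(expand(job_files[tok].split(), job_files, depth + 1))
        else:
            out.append(tok)
    return out
```

A job file that names itself keeps increasing the depth until the limit is hit, so the expansion ends with an error instead of looping forever.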
Broadly there are three file types: regular files, block devices
and block devices accessed via a pass-through interface. In earlier
sections these are abbreviated to "reg", "blk"
and "pt" respectively. Additionally there are various
special files that may also be useful: /dev/null, /dev/zero and
/dev/random. Then there are console input, output and error output
known in Unix as stdin, stdout and stderr respectively. ddpt (and dd)
use stderr for a summary of blocks
moved and for warning and error messages. Both stdin
and stdout are available for command
line piping.
The ddpt utility examines the files it is
given and treats them differently depending on their file type.
Depending on iflag=FLAGS and oflag=FLAGS settings:
O_DIRECT, O_SYNC, O_APPEND, O_EXCL and O_TRUNC flags may be added to
the relevant open system call. In Unix see 'man 2 open' or 'man -s 2
open' for more information on the open system call.
File type | open IFILE | open OFILE | IO method | Notes |
regular | O_RDONLY | O_WRONLY \| O_CREAT | Unix read() write() | N.B. A regular output file is overwritten (not truncated). |
stdin or stdout | [do nothing] | [do nothing] | Unix read() write() | hence open() flags have no effect (e.g. 'oflag=direct' is ignored) |
/dev/null or . (period) | O_RDONLY | [do nothing] | Unix read() if input | if output file then nothing is written |
block device | O_RDONLY | O_WRONLY \| O_CREAT | Unix read() write() | Windows uses a device specific IO method |
pt device | O_RDWR or O_RDONLY | O_RDWR | SCSI commands | Opens input O_RDONLY if O_RDWR fails |
Table 6: Treatment of various file types by ddpt
Some of the above combinations are not sensible (e.g.
'oflag=append' on a block device). When either 'iflag=direct' or
'oflag=direct' is given (hence opening the corresponding file with
O_DIRECT) the internal copy buffer used is aligned to the page size.
For example the page size in the Linux i386 architecture is 4
kilobytes.
Depending on the platform when a file is known
to be associated with the pass-through interface (e.g. in Linux
/dev/sg* and /dev/bsg/*
devices) the "pt" flag is assumed. This implies that in
Linux if 'if=/dev/sg2' is specified then there is no need to add
'iflag=pt'. On the other hand if the file appears to be a block
device (e.g. in Linux /dev/sdc) then
the normal read()/write() system calls will be used unless 'iflag=pt'
(or 'oflag=pt') is given.
With block and pt devices the
operating system may impose an upper limit on the size of each IO
operation. The size that ddpt will attempt to use is IBS*BPT
bytes. If this limit is exceeded the operating system may well
respond with an EIO (input/output) error. In such cases try reducing
the BPT value.
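The arithmetic behind that advice is simple enough to sketch. In the Python fragment below the function names and the max_io_bytes limit are hypothetical (the real limit depends on the operating system and the device); the values 512 and 128 match the block size and blocks_per_transfer used in the examples on this page.

```python
def transfer_bytes(ibs: int, bpt: int) -> int:
    """Size in bytes of each IO that would be attempted: IBS * BPT."""
    return ibs * bpt


def largest_bpt_within(ibs: int, max_io_bytes: int) -> int:
    """Largest BPT for which IBS * BPT stays within max_io_bytes (at least 1)."""
    return max(1, max_io_bytes // ibs)


print(transfer_bytes(512, 128))        # 65536
print(largest_bpt_within(512, 32768))  # 64
```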
If a partition of a block device is
accessed (e.g. in Linux /dev/sda2) and
the "pt" flag is not given then logical block address 0 for
the purposes of ddpt (and its skip and seek options) is the beginning
of that partition while the calculated count (e.g. when a 'count'
option is not given) is the extent of that partition. However if a
partition of a block device is accessed (e.g. in Linux /dev/sda2)
when the "pt" flag is active then the partition is ignored
and the underlying device (i.e. /dev/sda)
is accessed. This means logical block address 0 for the purposes of
ddpt (and its skip and seek options) is the beginning of the device
(i.e. not the partition) while the calculated count (e.g. when a
'count=' option is not given) is the extent of the whole
device.
From ddpt version 0.92
some further checks are made when a block device is accessed with the
"pt" flag. If there is a discrepancy between the block
device size and the SCSI READ CAPACITY command applied via the
pass-through then the copy is aborted unless the "force"
flag is given. For example: one would expect the block file size of
'if=/dev/sda' and the READ CAPACITY
size of 'if=/dev/sda iflag=pt' to
be the same; however the block file size of 'if=/dev/sda2'
should be less than the READ CAPACITY size of 'if=/dev/sda2
iflag=pt' causing the copy to be aborted with a warning.
Often retries are of little use, especially on medium errors,
since the device has probably already done multiple retries before
the medium error is reported. However a transport error (e.g. causing
a CRC error in returned data) is not necessarily seen by the device
and a retry may quickly solve the problem. In SAS a Transport Layer
Retries (TLR) state machine is optional and requires both the
initiator and target to implement the capability. Most first
generation SAS disks do not implement TLR. So transport errors in the
form of "aborted commands" can be reported due to
corruption (e.g. caused by marginal cables) or congestion.
When
the retries=RETR option is given and RETR is greater
than 0 then most errors on a READ or WRITE SCSI command are retried
up to RETR times. Device not ready errors are not retried and
"unit attention" conditions are automatically retried
(without looking at or decrementing RETR). Once the number of
retries is exhausted on the same operation without success then ddpt
will refer to the 'coe' option as to what to do next. Each new
operation (a READ or a WRITE, or the same command at a different
logical block address) has its own retry count initialized to RETR.
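The retry rules can be summarized as a small model. This Python sketch is an illustration only, not ddpt's code: ScsiError, run_with_retries and make_op are hypothetical names, and errors are reduced to a simple string category. It shows a per-operation count starting at RETR, "unit attention" retried without touching the count, and "not ready" never retried.

```python
class ScsiError(Exception):
    """Hypothetical error carrying a coarse category string."""

    def __init__(self, kind):
        super().__init__(kind)
        self.kind = kind  # e.g. "not_ready", "unit_attention", "aborted_command"


def run_with_retries(op, retr):
    """Run op(); True on success, False once the retries are exhausted
    (the point at which the coe setting would be consulted)."""
    remaining = retr  # every new operation starts with a fresh count
    while True:
        try:
            op()
            return True
        except ScsiError as err:
            if err.kind == "unit_attention":
                continue          # retried without decrementing the count
            if err.kind == "not_ready":
                raise             # device not ready: not retried
            if remaining <= 0:
                return False
            remaining -= 1


def make_op(failures):
    """Build an operation that fails with each listed kind, then succeeds."""
    pending = list(failures)

    def op():
        if pending:
            raise ScsiError(pending.pop(0))

    return op


print(run_with_retries(make_op(["aborted_command", "aborted_command"]), 2))
```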
The ddpt utility may be used as a copy "of last resort"
on failing media. ddpt will attempt to continue after errors termed
as "unrecoverable" or "medium" are encountered.
Other types of errors (e.g. lack of permissions on a device or an
interrupt signal) will terminate the copy at the point where they are
encountered. The general strategy is to replace unrecoverable blocks
on the input file with zeros; and to step over (i.e. essentially
ignore) medium errors on the output file. In both cases the alignment
of the input and output files is maintained.
Continue on
error can be invoked several ways: 'iflag=coe',
'oflag=coe', 'coe=1'
or 'conv=noerror,sync'. The first
variant requests coe on the input file, the second variant requests
coe on the output file; the third variant requests coe on both input
and output files although it may be ignored on the output file. The
last variant is for compatibility with the dd command and requests
coe on both input and output files.
The 'coe_limit=CL'
option is meant to stop ddpt continuing ad nauseam if unrecoverable
errors are being detected continuously on input. The input media may
be blank (e.g. unrecorded) or beyond its logical block address limit.
The default value of CL is zero and this is interpreted as not
having a limit. When coe is active additional counters are maintained
for unrecovered read and write errors. If they are non zero at the
end of the copy then they are printed out. If either appears then that
indicates an imperfect copy.
The actions of coe are
slightly different depending on the type of file. So there are two
subsections below, the first for coe used with block devices and
regular files; the second for coe used with pt (i.e. pass-through)
devices. If the reader intends to use coe on pt devices then it is
recommended to read both subsections.
Another dd variant
called dd_rescue (see www.garloff.de/kurt/linux/ddrescue/)
has similar "continue on error" facilities.
Continue on error (coe) can be requested for input block devices
or regular files. It is ignored for output block devices and regular
files meaning that any error on such an output file will terminate
the copy.
The standard dd command also has a
"continue on error" facility. These two invocations are
roughly the same:
ddpt if=/dev/sdc iflag=coe of=sdc.img bs=512
dd if=/dev/sdc of=sdc.img bs=512 conv=noerror,sync
Without
the 'count=COUNT'
option the whole of /dev/sdc will be
copied to sdc.img . The 'noerror'
argument to 'conv=' tells dd to
continue on error and the 'sync' tells
it to supply zeros for blocks that can not be read. If the 'sync'
is not given then the image file will be shorter when errors are
detected. As a convenience ddpt will accept 'conv=noerror,sync'
to mean 'iflag=coe '. So the following
invocation (followed by its output) is equivalent to the two above:
ddpt if=/dev/sdc of=sdc.img bs=512 conv=noerror,sync
16376+8 records in
16384+0 records out
8 unrecovered read errors
lowest unrecovered read lba=4656, highest unrecovered lba=4663
time to transfer data: 5.822152 secs at 1.44 MB/sec
So 8 unrecovered read errors
were detected, starting at LBA 4656 until LBA 4663 inclusive. Like
dd, unrecovered read errors are counted as partial reads hence the
"16376+8 records in" line.
There are a few problems with this: there was actually only one
unrecoverable error at LBA 4660 on the device; the other (related)
problem is the slow overall read speed. In this case the block layer
is accessing the device with a block size of 4 KBytes (while the
device has 512 byte blocks). That block size mismatch leads to 7 good
512 byte blocks being "lost" (i.e. LBAs 4656 to 4659 and
4661 to 4663 are valid but will be all zeros in sdc.img).
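The range 4656 to 4663 follows directly from the block size mismatch. The Python sketch below reproduces it, assuming the 4 KByte units are aligned to multiples of 8 logical blocks counted from the start of the device (that alignment is an assumption, though it is consistent with the reported range); the function name is hypothetical.

```python
def blocks_lost_to_page(bad_lba, lb_size=512, page_size=4096):
    """LBAs sharing the page-sized unit that contains bad_lba."""
    per_page = page_size // lb_size           # 8 blocks per 4 KByte unit
    first = (bad_lba // per_page) * per_page  # round down to the unit start
    return list(range(first, first + per_page))


zeroed = blocks_lost_to_page(4660)
good_but_lost = [lba for lba in zeroed if lba != 4660]
print(zeroed[0], zeroed[-1])  # 4656 4663
print(len(good_but_lost))     # 7
```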
Both dd and ddpt support "iflag=direct"
which bypasses the block layer's buffering. So using "iflag=direct"
is recommended; the improvement can be seen in the following
example:
ddpt if=/dev/sdc iflag=coe,direct of=sdc.img bs=512
16383+1 records in
16384+0 records out
1 unrecovered read error
lowest unrecovered read lba=4660, highest unrecovered lba=4660
time to transfer data: 1.026536 secs at 8.17 MB/sec
Notice
that only the unrecoverable LBA (i.e. 4660) has been found this time.
Also the overall read speed is faster. Unrecoverable errors on
spinning media (as distinct from Solid State Drives (SSDs)) often
occur within a range of LBAs, possibly due to physical damage
on the media. To help in identifying bad regions the lowest and
highest unrecovered LBA are shown at the end of the copy.
Continue on error (coe) can be requested for input and output pt
devices. Write errors on pt devices are reported and ignored and the
ddpt utility continues. [In the case where media errors are causing
write errors the user should check the setting of the AWRE bit in the
SCSI "read write error recovery" mode page (see SBC-3 at
http://www.t10.org).]
When
a SCSI READ command detects an unrecoverable read error it responds
with a sense key of MEDIUM ERROR or HARDWARE ERROR. Additionally it
responds with the logical block address of the first (lowest) block
that it failed to read in the current READ command. Any valid blocks
prior to the "bad" block may or may not have been
transferred. If coe is not given then the ddpt utility will simply
terminate at this point (with a reasonable amount of debug
information sent to stderr) and good blocks prior to the bad
block may not be copied (depending on the setting of BPT). If
coe is active then the first thing ddpt will try to do is a
truncated read up to, but not including, the bad block.
There
still remain blocks after the "bad" block that need to be
fetched. Further bad blocks may be detected and if so the algorithm
in the last paragraph is repeated. The result of this process is an
imperfect copy with blocks that were read properly placed in the
correct relative position in OFILE.
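The input-side strategy described in the last two paragraphs can be sketched as follows. This is a simplified Python model under stated assumptions, not ddpt's implementation: MediumError, make_device and copy_with_coe are hypothetical names, the "device" is an in-memory stand-in that always reports the lowest bad LBA correctly, and retries are left out.

```python
class MediumError(Exception):
    """Hypothetical error carrying the lowest LBA that could not be read."""

    def __init__(self, lba):
        super().__init__(f"medium error at lba {lba}")
        self.lba = lba


def make_device(bad_lbas, bs=512):
    """In-memory stand-in for a pt device; good blocks hold non-zero bytes."""
    bad = set(bad_lbas)

    def read(lba, n):
        for b in range(lba, lba + n):
            if b in bad:
                raise MediumError(b)
        return b"".join(bytes([b % 251 + 1]) * bs for b in range(lba, lba + n))

    return read


def copy_with_coe(read, start, count, bpt, bs=512):
    """Copy count blocks, substituting zeros for unreadable ones."""
    out = bytearray()
    unrecovered = []
    lba, end = start, start + count
    while lba < end:
        n = min(bpt, end - lba)
        try:
            out += read(lba, n)
            lba += n
        except MediumError as err:
            good = err.lba - lba
            if good > 0:
                out += read(lba, good)  # truncated read, up to the bad block
            out += bytes(bs)            # zeros keep the output aligned
            unrecovered.append(err.lba)
            lba = err.lba + 1           # carry on after the bad block
    return bytes(out), unrecovered


data, bad = copy_with_coe(make_device([5, 6]), 0, 16, bpt=8)
print(len(data) // 512, bad)
```

The output stays the same length as an error-free copy, which is what keeps good blocks at their correct relative positions.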
While accessing
a block device via a pt is often a "win" for error
processing it is not always so. There are lots of "SCSI"
devices that have substandard error reporting (e.g. many USB mass
storage devices). MEDIUM ERRORs are sometimes reported without the
LBA of the lowest block in error or it is misreported. The coe logic
can compensate for some known bad error reporting cases (e.g. many
CD/DVD/BD players cause problems due to standards (i.e. SPC and MMC)
conflict). Error reporting irregularities may lead to the early
termination of ddpt even when coe is given. In the author's
experience, good error reporting separates the professional products
from the rest.
When trying to recover data from a badly
flawed device, it is important not to blindly re-read bad blocks as
this will absorb a large amount of time. Most modern devices will not
report a bad block until they have retried internally many times and
using different techniques. Many OS block layers think they are being
helpful in adding their own layer of retries. This often leads to
very slow copy/recovery performance.
The 'retries=RETR'
option is also available to pt devices. Depending on the settings of
the SCSI Read-Write error recovery mode page, doing retries on medium
errors may or may not be worthwhile. Often a disk has retried reading
a block multiple times before reporting a medium error. However there
are other sorts of errors (e.g. transport errors) that are often
worth retrying. The 'retries=RETR'
option logic is applied (and if necessary exhausted) before the coe
logic starts its work.
Here is an example following on
from those in the previous subsection:
ddpt if=/dev/sdc iflag=coe,pt of=sdc.img bs=512
>> unrecovered read error at blk=4660, substitute zeros
16383+1 records in
16384+0 records out
1 unrecovered read error
lowest unrecovered read lba=4660, highest unrecovered lba=4660
time to transfer data: 0.520079 secs at 16.13 MB/sec
An
additional feature of coe on pt devices is that unrecovered blocks
are reported as they are detected.
Often errors are recovered using ECC data or by the device
retrying (usually re-reading) the media. Typically at the first sign
of trouble, recoverable errors lead to the block in question being
reassigned to another location on the media (automatically when the
AWRE and ARRE bits are set in the "read write error recovery"
mode page). The user may be blissfully unaware that the media
may be reaching the end of its useful life. Error counters are
maintained in the "Read error counter" and "Write
error counter" log pages which can be viewed with smartctl
(from smartmontools) and sg_logs (from the sg3_utils package). Any
block that is automatically or manually re-assigned adds a new entry
to the "grown" defect list which can be viewed with 'sginfo
-G' or 'sg_reassign -g' (both
found in the sg3_utils package).
A SCSI storage device can
be instructed to report RECOVERED ERRORs by setting the PER bit in
the "read write error recovery" mode page. Most often this
bit is clear. When ddpt detects RECOVERED ERRORs it reports them,
counts them and continues the copy. Only the LBA of the last
recovered error in a READ or WRITE SCSI command is reported so there
could be more than one recovered error per SCSI command. The bpt=1
option could be chosen to limit every SCSI command to a single block
transfer (but that would slow things down a fair amount). If the
count of recovered errors is greater than zero at the end of the copy
then this count is output as well.
There can be other
reasons for a sense key of RECOVERED ERROR not directly related to
data being currently read or written. SMART alerts (called in SCSI
documents "Informational Exceptions") can be conveyed via a
RECOVERED ERROR sense key (see the MRIE field in the Informational
Exceptions mode page). Such alerts have additional sense codes like
"Failure prediction threshold exceeded" and those that
contain "impending failure".
Unix file systems usually allow the file pointer to be moved (e.g.
with the lseek() system call) around
arbitrarily. Moving the file pointer around can result in "holes"
or overwriting of existing data in a file. An output file containing
"holes" is what the term sparse writes refers to.
When such a sparsely written file is read then each "hole"
is interpreted as a sequence of zero bytes. In the case of a regular
file, one way of detecting its "sparseness" is to compare
the output of the du and ls -l Unix commands.
The
concept of sparse writes also is useful for block devices accessed
via standard Unix commands. And it is useful for direct access
devices accessed via a SCSI command set (such as SBC for disks or MMC
for cd/dvds). The underlying assumption here is that the device
already has zero bytes in the blocks that are not explicitly written
to. A SCSI disk just after the FORMAT command has been successfully
performed typically contains zeros in all its blocks. Another way to
"zero" a SCSI disk is with a WRITE SAME command (the ATA8
SCT feature set also contains a WRITE SAME command).
When the ddpt utility is given the 'oflag=sparse' or 'oflag=strunc'
flag, it checks each copy buffer fetched from IFILE for zeros. The
size of each copy buffer, perhaps apart from the last buffer, is
IBS*BPT bytes (see OBPC below). If the buffer is full of zeros
then the corresponding write to OFILE is bypassed.
Bypassing
writes of blocks full of zeros can save a lot of IO. However with
regular files, bypassed writes at the end of the copy can lead to an
OFILE which is shorter than it would have been without sparse
writes. This can lead to integrity checking programs like md5sum
and sha1sum generating misleading
values.
This utility has two ways of handling this file
length problem: writing the last block (even if it is full of zeros)
or using the ftruncate system call. A third approach is to ignore the
problem (i.e. leaving OFILE shorter). The ftruncate approach
is used when 'oflag=strunc' is given, while the last block is written
when 'oflag=sparse'. To ignore the file length issue use
'oflag=sparse,sparse'. Note that if OFILE's length is already
correct or longer than required, no action is taken.
The
support for sparse writing of regular files may depend on the OS, the
file system and the settings of OFILE. POSIX makes few
guarantees when the ftruncate system call is used to extend a file's
length, as may occur when 'oflag=strunc'. Further, primitive file
systems like VFAT may not accept sparse writes or simulate the effect
by writing blocks of zeros. The latter approach will defeat any
sparse writing performance gain.
A sparse file may also be
created by ddpt by using the
'seek=SEEK' option. Here is an example:
$ ddpt if=/dev/zero of=t seek=1m bs=1024 count=1
1+0 records in
1+0 records out
time to transfer data: 0.000156 secs at 6.56 MB/sec
$ ls -lh t
-rw-rw-r-- 1 fred fred 1.1G 2007-06-28 22:15 t
$ du -h t
12K t
The above shows that even though the file system
knows the sparse file is (logically) 1.1 GB long, it only consumes 12
KB of space within the file system. In the above case, ddpt is
producing the same result as the standard dd command. Programs
that calculate checksums such as md5sum and sha1sum
should give the same result when applied to either a sparse file or
the corresponding non-sparse file.
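The same effect can be reproduced without ddpt by seeking past the end of a new file before writing. The Python sketch below uses a hypothetical helper name and a much smaller offset than the example above; whether a hole is actually created depends on the OS and file system, and the 512-byte unit for st_blocks is the Linux convention rather than a guarantee.

```python
import os
import tempfile


def make_sparse_file(path, seek_blocks, bs=1024):
    """Write one zero-filled block at offset seek_blocks * bs.

    Returns (logical size, allocated bytes), comparable to what
    'ls -l' and 'du' report.
    """
    with open(path, "wb") as f:
        f.seek(seek_blocks * bs)  # moving past the end leaves a hole
        f.write(bytes(bs))
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512  # st_blocks: 512-byte units on Linux


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        logical, allocated = make_sparse_file(os.path.join(tmp, "t"), 1024)
        print(logical, allocated)
```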
For speed, it is best
to have BPT at its default value or larger. However doing
sparse write checks for zeros in units of IBS*BPT bytes
may be too large, missing the chance to bypass many
writes. The OBPC option allows the granularity of the check
buffer to be reduced, the minimum being one OBS [the output
(logical) block size]. Values of OBPC that imply a check
buffer larger than IBS*BPT bytes are rounded back to
OBPC=(BPT*IBS)/OBS . The default value of OBPC (0) also
uses a check buffer of IBS*BPT bytes.
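To make the effect of OBPC concrete, here is a minimal Python sketch of the zero check at a chosen granularity. It is a model only, not ddpt's code: the function names are hypothetical, and the fall-back for OBPC values of 0 or larger than the copy buffer follows the description above.

```python
def effective_obpc(ibs, bpt, obs, obpc=0):
    """Number of OBS-sized blocks per zero check.

    0 (the default) and oversized values both mean the whole copy buffer.
    """
    whole = (bpt * ibs) // obs
    if obpc <= 0 or obpc > whole:
        return whole
    return obpc


def sparse_segments(buf, obs, obpc):
    """Yield (offset, chunk) for each chunk that still needs to be written."""
    step = obs * obpc
    for off in range(0, len(buf), step):
        chunk = buf[off:off + step]
        if any(chunk):  # at least one non-zero byte, so it must be written
            yield off, chunk


buf = bytes(512 * 7) + b"\x01" * 512  # 8 blocks, only the last is non-zero
coarse = list(sparse_segments(buf, 512, effective_obpc(512, 8, 512)))
fine = list(sparse_segments(buf, 512, effective_obpc(512, 8, 512, obpc=1)))
print(sum(len(c) for _, c in coarse), sum(len(c) for _, c in fine))  # 4096 512
```

With the whole buffer checked at once, one non-zero block forces all 4096 bytes to be written; with a check size of one block only 512 bytes are written.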
Write sparing is most useful when a significant proportion of the
data to be written is expected to be identical to the data already
there, and where writing is slower than reading, or where write
endurance is limited (e.g. SSD, USB flash drive or memory card). For
example, suppose you have a bootable USB flash drive which you
regularly back up to an image file on your SSD. Using write sparing
will greatly reduce the amount of data that needs to be written,
since most data will match that in the already-existing previous
image file.
With write sparing, after reading the IFILE,
the corresponding segment in the OFILE is read into a
second buffer and the two buffers are compared. If unequal, the write
of the original buffer to OFILE takes place as normal. If
equal then the write to OFILE is bypassed. The OFILE
should exist and be readable and seek-able (hence stdout is not
appropriate). OFILE's length may be shorter than that of
IFILE.
It seems unlikely that it would be useful to
have both sparse writes and write sparing active on the same OFILE.
If they are both given (i.e. 'oflag=sparing,sparse') then sparse
writes are checked first and if zeros are found, the check for write
sparing is bypassed on that segment.
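The compare-then-write logic of write sparing can be sketched with ordinary file objects. In the Python fragment below, copy_with_sparing is a hypothetical name, the files are in-memory io.BytesIO objects and seg_bytes stands in for the copy segment size; it models the behaviour described above and is not ddpt's implementation.

```python
import io


def copy_with_sparing(ifile, ofile, seg_bytes):
    """Copy ifile to ofile, skipping writes where ofile already matches.

    Returns (segments_written, segments_spared).
    """
    written = spared = 0
    offset = 0
    while True:
        src = ifile.read(seg_bytes)
        if not src:
            break
        ofile.seek(offset)
        existing = ofile.read(len(src))  # may be short if ofile is shorter
        if existing == src:
            spared += 1                  # identical: bypass the write
        else:
            ofile.seek(offset)
            ofile.write(src)
            written += 1
        offset += len(src)
    return written, spared


src = io.BytesIO(b"A" * 1024 + b"B" * 1024 + b"C" * 1024)
dst = io.BytesIO(b"A" * 1024 + b"X" * 1024)  # shorter than the source
print(copy_with_sparing(src, dst, 1024))     # (2, 1)
```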
This is a storage feature often associated with Solid State Disks
(SSDs) or disk arrays with "thin provisioning". In the ATA
command set (ACS-2) the relevant command is DATA SET MANAGEMENT with
the TRIM bit set. In the SCSI command set (SBC-4) it is either the
UNMAP or WRITE SAME commands. Note that there is no TRIM command;
however, this feature has been christened "trim" by the technical
press.
Trim is a way of telling a storage device that
blocks are no longer needed. Keeping the pool of unwritten blocks
large is important for the write performance of SSDs and the thrifty
use of real storage in thin provisioned arrays. Currently file
systems in recent OSes may issue trims associated with file deletes.
The trim option in ddpt may be useful when a partition or a whole SSD
is to be "deleted". Note that ddpt is bypassing file
systems in that it only offers trim on pass-through (pt)
devices.
This utility issues SCSI commands to pt devices
and for "trim" currently issues a SCSI WRITE SAME(16)
command with the UNMAP bit set. If the pt device is an SSD with an ATA
interface then recent versions of Linux will translate the SCSI WRITE
SAME command to the ATA DATA SET MANAGEMENT command with the TRIM bit
set. The maximum size of each "trim" command sent is the
size of the copy buffer (i.e. IBS * BPT bytes). And that maximum can
be reduced with the OBPC argument of the 'bpt=' option.
In the Unix style, ddpt doesn't output anything (to stderr) during
large IO transfers. To get a progress report the SIGUSR1 signal can
be sent to the ddpt process. In the Unix dd command style, ddpt
outputs two lines on completion that show the number of full and
partial records in (on the first line) and out (on the second
line).
ddpt has a 'verbose=' option whose default value is
zero. When set to these values 'verbose=' has the following actions:
1: show categorization and INQUIRY data (where applicable) for the input and output files. For files, other than streams, the file/device size (and device block size) are output.
2: same output as 1 plus data for Unix and SCSI commands (cdbs) that are not repeated (i.e. other than Unix read/write and SCSI READ/WRITE commands). Increased error reporting for all SCSI commands.
3: same output as 2 plus data for Unix and SCSI commands (cdbs) that are repeated. For a large copy this will be a lot of output.
4: maximum amount of debug output. For a large copy this will be a lot of output.
All verbose output is sent to stderr so that ddpt with "of=-"
(copy output to stdout) is not corrupted.
Following is an
example of using verbose=1 to find information about /dev/sda.
If no copy is required then setting count=0 will see to that. Since
/dev/sda is a block device then it
would normally be accessed via Unix system calls. The verbose=1
output is relatively short for non-pt devices. The second invocation
is with 'iflag=pt' and more is output. That includes INQUIRY standard
response data (e.g. "SEAGATE ..." line). See the SBC-2
drafts at www.t10.org for more
information.
$ ddpt if=/dev/sda bs=512 verbose=1 count=0
>> Input file type: block device
open input, flags=0x0
>> Output file type: null device
/dev/sda [blk]: blocks=625142448 [0x2542eab0], block_size=512, 320 GB approx
skip=0 (blocks on input), seek=0 (blocks on output)
initial count=0 (blocks of input), blocks_per_transfer=128
0+0 records in
0+0 records out
time to read data: 0.000028 secs
# ddpt if=/dev/sdb iflag=pt bs=512 verbose=1 count=0
>> Input file type: pass-through [pt] device block device
/dev/sdb: Linux scsi_debug 0004 [pdt=0]
>> Output file type: null device
/dev/sdb [pt]: blocks=16384 [0x4000], block_size=512, 8 MiB approx
skip=0 (blocks on input), seek=0 (blocks on output)
initial count=0 (blocks of input), blocks_per_transfer=128
0+0 records in
0+0 records out
time to read data: 0.000031 secs
As
an experimental feature setting 'verbose=-1' will map stderr to
/dev/null so that no debug messages and
copy summary will appear.
It may be useful to copy a partition on a disk. To do this the
partition table may need to be read, preferably in units that are
useful for ddpt. Following is an example of the GNU parted utility
where "unit s" means in units of the logical block size
(e.g. 512 bytes):
# parted /dev/sda unit s print
Model: ATA FUJITSU MHY2160B (scsi)
Disk /dev/sda: 312581808s
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Number  Start       End         Size        Type      File system     Flags
 1      2048s       14280703s   14278656s   primary   ntfs
 2      14280704s   156299263s  142018560s  primary   ntfs            boot
 3      156310560s  312575759s  156265200s  extended
 5      156310623s  310700879s  154390257s  logical   ext3
 6      310700943s  312575759s  1874817s    logical   linux-swap(v1)
Assume we want
to copy the whole "2" partition to a file, that could be
done a few ways:
ddpt if=/dev/sda2 of=/tmp/a.bin bs=512
ddpt if=/dev/sda skip=14280704 of=/tmp/b.bin bs=512 count=142018560
ddpt if=/dev/sda3 iflag=pt skip=14280704 of=/tmp/c.bin bs=512 count=142018560
So if /dev/sda2 is named then
'skip=' and 'count=' are not needed unless 'iflag=pt' is given. Once
'iflag=pt' is given any variants of /dev/sda
(e.g. /dev/sda1 /dev/sda2
/dev/sda3 etc) map back to /dev/sda
because SCSI commands are not partition aware. From ddpt version 0.92
the third case (i.e. 'if=/dev/sda3 iflag=pt')
would be aborted with a warning; to override that 'iflag=pt,force'
would be required.
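The 'skip=' and 'count=' values used above come straight from the parted listing: the start sector, and end minus start plus one. A small Python sketch (the helper names are hypothetical) that derives them from parted's "unit s" strings:

```python
def sectors(field: str) -> int:
    """Convert a parted 'unit s' field such as '14280704s' to an integer."""
    return int(field.rstrip("s"))


def skip_and_count(start: str, end: str):
    """Return (skip, count) for a partition given its Start and End fields."""
    first, last = sectors(start), sectors(end)
    return first, last - first + 1  # End is inclusive


skip, count = skip_and_count("14280704s", "156299263s")
print(f"skip={skip} count={count}")  # skip=14280704 count=142018560
```

The computed count matches the Size column that parted prints for partition 2.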
As a double check the ddpt 'verbose=1
count=0' test will show the size of what is being considered:
# ddpt if=/dev/sda2 bs=512 verbose=1 count=0
>> Input file type: block device
open input, flags=0x0
>> Output file type: null device
/dev/sda2 [blk]: blocks=142018560 [0x8770800], block_size=512, 72 GB approx
...
# ddpt if=/dev/sda2 iflag=pt,force bs=512 verbose=1 count=0
>> Input file type: pass-through [pt] block device
/dev/sda2: ATA FUJITSU MHY2160B 0000 [pdt=0]
>> Output file type: null device
/dev/sda2 [pt]: blocks=312581808 [0x12a19eb0], block_size=512, 160 GB approx
...
Partition
table information can also be obtained with the fdisk utility. The
output is a little more messy than parted for the same 160 GB disk.
Note the partition size is in the "Blocks" column and seems
to be in 1024 byte units:
# fdisk -ul /dev/sda
Disk /dev/sda: 160.0 GB, 160041885696 bytes
255 heads, 63 sectors/track, 19457 cylinders, total 312581808 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0x79cbdc8f
   Device Boot      Start         End      Blocks   Id  System
/dev/sda1            2048    14280703     7139328   27  Unknown
Partition 1 does not end on cylinder boundary.
/dev/sda2   *    14280704   156299263    71009280    7  HPFS/NTFS
Partition 2 does not end on cylinder boundary.
/dev/sda3       156310560   312575759    78132600    5  Extended
Partition 3 does not end on cylinder boundary.
/dev/sda5       156310623   310700879    77195128+  83  Linux
/dev/sda6       310700943   312575759      937408+  82  Linux swap / Solaris
When
copying a partition to a file, a lot of storage may be required.
Using 'oflag=sparse' may save space. Copying a small amount first and
checking the OFILE with a utility like 'hexdump
-C' may confirm that the full copy will be worthwhile.
XCOPY is a common shortening for the EXTENDED COPY facility
introduced in SPC-2 (ANSI INCITS 351-2001). EXTENDED COPY was
enhanced in SPC-3 and now as SPC-4 approaches standardization the
facility has become a lot larger. The original facility introduced
in SPC-2 is now called EXTENDED COPY(LID1) where "LID1"
means it has a List IDentifier length of 1 byte. There is now a
"LID4" variant (with 4 byte list identifiers) with added
flexibility and complexity. A subset of EXTENDED COPY(LID4) that
supports token based disk to disk copies was proposed with the name
"xcopy version 2, lite" and accepted. It is based on two
new SBC-3 commands: POPULATE TOKEN and WRITE USING TOKEN. Microsoft
has integrated this subset into its servers and given it the name:
Offloaded Data Xfer (ODX).
Individual SCSI and ATA
disks do not typically support xcopy; disk arrays and iSCSI servers
do. Support for xcopy(LID1) has been added to the Linux target
subsystem. There is a 3PC field in a standard SCSI INQUIRY response
that when set indicates a "logical unit" (LU: an
abstraction of a disk) supports at least some xcopy functionality. If
the 3PC field is not set in both the source LU and the destination LU
then an xcopy operation is extremely likely to fail.
Support
for disk to disk copies using EXTENDED COPY(LID1) was added to the
sg3_utils package with the sg_xcopy utility (and its companion:
sg_copy_results). That sg_xcopy functionality was ported into ddpt in
version 0.93 and is referred to in these pages as "xcopy".
In version 0.94 of ddpt ODX support has been added and is referred to
in these pages as "odx". A new companion utility called
ddptctl adds some extra odx functionality such as listing and
decoding ROD Tokens and the ability to abort a copy in progress.
See
ddpt_xcopy_odx for more
information.
On Linux systems, ddpt can also work with tape drives via the "st"
SCSI tape driver. On Debian-based distributions, it is suggested that
you install the mt-st package, which provides a more fully-featured
version of the "mt" tape control program (see 'man mt' for
more details).
Tape drives can operate in fixed- or
variable-length block modes. In variable-block mode, each write to
the tape writes a single block of that size. In fixed-block mode,
each write to the tape must be a multiple of the previously-selected
block size. The block size/mode can be set with the mt command prior
to invoking ddpt. For example:
# mt -f /dev/nst0 setblk 0
sets variable-block mode, and
# mt -f /dev/nst0 setblk 32768
sets fixed-block mode with block size 32768 bytes.
Note that some tape drives support
only fixed-block mode, and possibly even only one block size. (For
example, QIC-150 tapes use a fixed block size of 512 bytes.) There
may also be restrictions on the block size, e.g. it may have to be
even.
When using ddpt to write to tape, if the final read
from the input is less than OBS, it is padded to OBS bytes before
writing to tape to ensure that all blocks of the tape file are the
same length. Having a shorter final block would fail if the drive is
in fixed-block mode, and could create interchange problems. It is
common to expect all blocks in a file on tape to be the same length.
However, to tell ddpt not to pad the final block, use oflag=nopad.
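The padding rule amounts to rounding the final write up to a whole OBS. In the Python sketch below the function name is hypothetical and the use of zero bytes as fill is an assumption (the text above only says the block is padded to OBS bytes).

```python
def pad_final_block(data: bytes, obs: int, nopad: bool = False) -> bytes:
    """Pad a short final block up to obs bytes unless nopad is set.

    Zero fill is assumed here for illustration.
    """
    short = len(data) % obs
    if short == 0 or nopad:
        return data
    return data + bytes(obs - short)


print(len(pad_final_block(b"x" * 100, 512)))              # 512
print(len(pad_final_block(b"x" * 100, 512, nopad=True)))  # 100
```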
The st tape driver normally writes a filemark when the
file (/dev/nst0 etc.) is closed. If you prefer to not have the
filemark written, use oflag=nofm . One use case for that might be if
using ddpt several times in succession to append more data to the
same file on tape. In that case you will probably want to ensure that
a filemark gets written at the end. So either omit oflag=nofm on the
last ddpt invocation, or manually write a filemark using mt after
ddpt exits:
# mt -f /dev/nst0 weof 1
For reading from an unknown tape where you don't
know which block size(s) were used, you can read in variable-block
mode specifying a large IBS. The st driver returns a smaller amount
of data if the size of the block read is smaller. Thus a command
like:
# ddpt if=/dev/nst0 of=output.bin bs=262144
should read the file from tape
regardless of the block size used (assuming no blocks are larger than
256KB). You can use the verbose option to have ddpt tell you what the
actual block size(s) is.
Tape users may be interested in
this virtual tape library project: mhvtl.
These are helper utilities for ddpt. Both have standard Unix option syntax. So both have long options starting with "--" and short options starting with a single "-".
Most of these examples use Linux device names. See the device
naming page for appropriate device names in other supported
operating systems.
To start with, read 1024 blocks, each
of 512 bytes, from a block device. Notice there is no 'of=<OFILE>'
argument so the output goes to /dev/null
(i.e. it gets thrown away). [Beware: the dd command defaults to
sending the output to stdout (often making a mess on the screen)].
# ddpt if=/dev/sda bs=512 count=1k
Output file not specified so no copy, just reading input
1024+0 records in
0+0 records out
time to read data: 0.013480 secs at 38.89 MB/sec
Now to
access the same device (assuming /dev/sda
and /dev/sg0 refer to that device) via
the pass-through interface use either:
# ddpt if=/dev/sda iflag=pt bs=512 count=1k
Output file not specified so no copy, just reading input
1024+0 records in
0+0 records out
time to read data: 0.005916 secs at 88.62 MB/sec
# ddpt if=/dev/sg0 bs=512 count=1k
Output file not specified so no copy, just reading input
1024+0 records in
0+0 records out
time to read data: 0.005945 secs at 88.19 MB/sec
The
first form needs the 'iflag=pt' option because given a device name
that can be accessed via either a block device interface or a
pass-through interface, ddpt will default to using the block device
interface. In the second form the /dev/sg0
device only supports the pass-through interface.
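The rates reported in the three runs above are consistent with the number of bytes read divided by the elapsed time, with 1 MB taken as 1,000,000 bytes. The following is a minimal sketch of that arithmetic; rate_mb_per_sec is a hypothetical helper written for this page, not something ddpt provides.

```shell
# rate_mb_per_sec BLOCKS BS SECS
# Hypothetical helper (not part of ddpt): bytes moved / elapsed time,
# assuming 1 MB = 1,000,000 bytes.
rate_mb_per_sec() {
    awk -v n="$1" -v bs="$2" -v t="$3" \
        'BEGIN { printf "%.2f\n", (n * bs) / t / 1000000 }'
}

rate_mb_per_sec 1024 512 0.013480   # /dev/sda, block device interface
rate_mb_per_sec 1024 512 0.005916   # /dev/sda with iflag=pt
rate_mb_per_sec 1024 512 0.005945   # /dev/sg0
```

With only 512 KiB read per run, the timings are short, so differences between the runs should be treated with some caution.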
To copy
from a block device to a file:
# ddpt if=/dev/sdb of=t.img bs=512 count=64
64+0 records in
64+0 records out
time to transfer data: 0.179983 secs at 0.18 MB/sec
This copies 32 KB (64 blocks of 512 bytes) from the beginning of
/dev/sdb to the file t.img.
Now a
bit more ambitious: to copy from one block device to another. Beware
that writing to a block device is an irreversible operation so take
care. To have a closer look at what might happen use the combination
of 'count=0 verbose=2' to check things:
# ddpt if=/dev/sda of=/dev/sdb oflag=pt bs=512 count=0 verbose=2
>> Input file type: block device
open input, flags=0x0
>> Output file type: block device
open /dev/sdb with flags=0x802
inquiry cdb: 12 00 00 00 24 00
/dev/sdb: Linux scsi_debug 0004 [pdt=0]
/dev/sda [blk]: blocks=625142448 [0x2542eab0], block_size=512, 320 GB approx
read capacity (10) cdb: 25 00 00 00 00 00 00 00 00 00
/dev/sdb [pt]: blocks=2048 [0x800], block_size=4096, 8 MiB approx
>> warning: /dev/sdb block size confusion: obs=512, device claims=4096
skip=0 (blocks on input), seek=0 (blocks on output)
ibs=512 bytes, obs=512 bytes, OBPC=0
initial count=0 (blocks of input), blocks_per_transfer=128
0+0 records in
0+0 records out
time to transfer data: 0.000034 secs
To add some variation, the 'pt' option was
selected on the output block device. It is not necessary to
understand all the details. The main point is that 'count=0'
makes sure no data is actually written, so no damage is done, while
the size of each disk (in blocks) and its logical block size are
still examined. This highlights a problem, noted on the line starting
with ">> warning:", which is that the logical block size of
/dev/sdb is 4096 bytes. Since 'obs=OBS' has not been specified, OBS
is assumed to be the same as BS, which is 512 bytes. So the
invocation should have been:
# ddpt if=/dev/sda ibs=512 of=/dev/sdb oflag=pt obs=4096 count=0 verbose=2
Grouping the arguments helps as does
using 'ibs=512' rather than 'bs=512' in this case. Note that the
count argument, if given, is in units of IBS. If we do the
copy then the size of the smaller disk dictates the number of blocks
moved (if the count argument is not given or is too large):
# ddpt if=/dev/sdb ibs=512 of=/dev/sdc oflag=pt obs=4096
3477504+0 records in
434688+0 records out
time to transfer data: 108.931869 secs at 16.34 MB/sec
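The two record counts in this output describe the same number of bytes, expressed in units of IBS and OBS respectively. A minimal sketch of the conversion follows; records_out is a hypothetical helper for illustration and the sketch assumes the byte count is an exact multiple of OBS, as it is here.

```shell
# records_out RECORDS_IN IBS OBS
# Hypothetical helper (not part of ddpt): convert a count of input
# records into the matching count of output records.
records_out() {
    echo $(( $1 * $2 / $3 ))
}

records_out 3477504 512 4096                 # output records for this copy
echo "bytes copied: $(( 3477504 * 512 ))"    # total size in bytes
```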
In this case
/dev/sdb was the smaller (about 1.8
GB). There are eight times as many "records in" as
"records out", reflecting the different block sizes.
In
the next case the size of /dev/sdc
becomes the limiting factor due to the seek=2097000
option:
# ddpt if=/dev/sdb ibs=512 of=/dev/sdc oflag=pt obs=4096 seek=2097000
1216+0 records in
152+0 records out
time to transfer data: 0.063875 secs at 9.75 MB/sec
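The counts above follow from the space left on the output device after the seek position. In the sketch below the device size of 2097152 blocks (of 4096 bytes) is an inference from this output (2097000 + 152), it is not stated elsewhere on this page; blocks_left is a hypothetical helper.

```shell
# blocks_left DEV_BLOCKS SEEK
# Hypothetical helper (not part of ddpt): output blocks available
# between the seek position and the end of the output device.
blocks_left() {
    echo $(( $1 - $2 ))
}

# 2097152 is an assumed size for /dev/sdc, inferred from the output.
out=$(blocks_left 2097152 2097000)
echo "records out: $out"
echo "records in:  $(( out * 4096 / 512 ))"   # same bytes in 512 byte units
```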
The
sparse writing flag can be used to count the number of blocks
containing zeros. Here 5 GB of a lightly used partition is checked,
starting 2 GB after its start:
# ddpt if=/dev/sda7 skip=4m bs=512 oflag=sparse count=10m
Output file not specified so no copy, just reading input
10485760+0 records in
0+0 records out
8895616 bypassed records out
time to read data: 138.777383 secs at 38.69 MB/sec
Actually a
copy buffer (128 blocks in this case) is being checked for zeros, so
not every block containing all zeros is counted. A more
accurate count can be made by setting OBPC to 1. However the
execution time can suffer slightly:
# ddpt if=/dev/sda7 skip=4m bs=512 oflag=sparse count=10m bpt=128,1
Output file not specified so no copy, just reading input
10485760+0 records in
0+0 records out
8908836 bypassed records out
time to read data: 136.614629 secs at 39.30 MB/sec
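The numbers in the two sparse runs can be checked with a little arithmetic. The sketch below handles only the 'k' and 'm' multipliers seen in these examples (1024 and 1048576, which match the "records in" counts shown); expand_suffix is a hypothetical helper and does not attempt to cover every suffix ddpt accepts.

```shell
# expand_suffix NUM
# Hypothetical helper (not part of ddpt): expand the 'k' and 'm'
# multipliers used in the examples above. Other suffixes are passed
# through unchanged.
expand_suffix() {
    case "$1" in
        *k) echo $(( ${1%k} * 1024 )) ;;
        *m) echo $(( ${1%m} * 1048576 )) ;;
        *)  echo "$1" ;;
    esac
}

skip=$(expand_suffix 4m)      # blocks skipped on input
count=$(expand_suffix 10m)    # blocks read
echo "start offset:   $(( skip * 512 )) bytes"
echo "length checked: $(( count * 512 )) bytes"
# Extra all-zero blocks found when each block is checked individually:
echo "gain with OBPC=1: $(( 8908836 - 8895616 )) blocks"
```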
A job file
called read_check.jf might contain this:
# Read given IFILE as pt device and check, continue on error
iflag=pt,coe # access as pt device (may fail depending on OS)
bs=512
verbose=1 # could be increased for more noise
and be used like this:
ddpt --job=read_check.jf if=/dev/sdc
Note
that /dev/sdc is used as a pass-through
device which will yield better (or at least lower level) error
reporting if a problem is found.
There are also some
examples in the ddpt man page and the doc/ddpt_examples.txt
file in the distribution tarball.
The tarball contains the source and can be built with a './configure ; make ; make install' sequence. In some cases executing the './autogen.sh' script prior to './configure' may be required.
Table 7. ddpt tarballs and packages
ddpt version | tarball | i386 rpm binary | debian package
0.91 | | |
0.92 | | |
0.93 | | |
0.94 | | |
0.95 | ddpt-0.95.tgz, ddpt-0.95.tar.xz | |
0.96 20200303 | | |
0.97 20210421 | | |
The Windows executable was made in a MinGW environment. Here
are the most recent ChangeLog and Unix
style manpages for ddpt, ddptctl
and ddpt_sgl in html.
This
utility shares code with the sg3_utils package, specifically a
library called libsgutils. If that library is detected during the
build, it is used at runtime. If it is not detected during the
build, the required code is built into the
executable (making it slightly larger). For ease of use, the binary
packages in table 7 do not depend on libsgutils, but distributions
(e.g. Red Hat and Debian) prefer to factor out common code.
The Open Group Base Specifications Issue 7 (also known as SUSv4)
have a useful definition of the basic Unix dd command: see
http://www.opengroup.org/onlinepubs/9699919799
and select "Shell & Utilities" (on the left) then
select "Utilities" (on the lower left), and finally select
"dd" (from the list in the lower left).
When a
pass-through interface is used, the ddpt utility issues SCSI commands
that are defined in SPC-4 (primary commands), SBC-3 (commands for
direct access devices, e.g. disks) and MMC-5 (commands for CD/DVD
devices). These SCSI command sets can be found at www.t10.org.
When the storage device is an ATA disk (e.g. a SATA disk) a SCSI to
ATA Translation layer (SATL compliant with SAT or SAT-2) is assumed.
Last updated: 22nd April 2021