ddpt utility (a Unix dd command variant)

The ddpt utility

Introduction

The ddpt utility is a variant of the standard Unix command dd which copies files. The ddpt utility specializes in files that are block devices. For block devices that understand the SCSI command set, finer grain control over the copy may be available via a SCSI pass-through interface. ddpt has been developed for Linux and ported to FreeBSD, Solaris and Windows.

The types of block devices that are supported are disks (known as direct access devices in SCSI) and cd/dvd/bd devices. It is becoming more common for ATA disks (especially SATA) to be accessed by an operating system using SCSI commands. ATA disks are not always directly connected and transports such as USB, IEEE1394 (FireWire) and iSCSI use SCSI commands. Protocol translation from SCSI to ATA ("SAT", first standardized in 2007) has appeared in OSes and external devices (e.g. recent USB disk enclosures) and many implementations are mature enough for ddpt to use. Data can also be copied to and from NVMe disks.

The ddpt utility is a more generic version of the Linux specific sg_dd and sg_xcopy utilities found in the sg3_utils package. Tape drives are only supported in Linux; and only via the mtio interface associated with st devices.

The ddpt utility supports two types of offloaded copy. They are referred to as "xcopy" and "odx". See the ddpt_xcopy_odx page for more information.

This page outlines the features of the ddpt utility version 0.97. The ddpt utility is found in the package of the same name.

New ddpt features

Some features found in ddpt which are not present in the (GNU) dd implementation:

odx offloaded copy (SCSI EXTENDED COPY(LID4) subset)
xcopy (SCSI EXTENDED COPY(LID1)) to offload copies via the pt interface
support for SCSI protection information (DIF)
introduce bpt=BPT (blocks per transfer) so bs=BS is kept as logical block size
generalize skip=SKIP and seek=SEEK to accept scatter gather lists (sgls)
bandwidth limiting via delays between each copy segment
pre-allocate output file prior to copying to it; to reduce fragmentation
write sparing (i.e. don't write buffer if already same as destination)
resume (after the copy has been interrupted)
trim on output of copy, self trim (pt interface only)
send output to a second file (see 'of2=' option)
when the --verify option is given, IFILE is read and that data is sent (but not written) to OFILE in a SCSI VERIFY(BYTCHK=1) command
the --prefetch option may accompany --verify: a SCSI PRE-FETCH(IMMED) on OFILE precedes the SCSI READ on IFILE
put options in a job file, to save retyping on each invocation
access devices directly via pass-through (pt) interface, bypassing the kernel block layer
NVMe disks can be accessed either as block devices or via a SNTL via the pass-through (pt) interface (Linux only)
accept numeric command line arguments in hexadecimal
explicit controls over how much data is read into the copy buffer and then written to output (separate from the logical block sizes of any device involved)

Sparse writing is an important ddpt feature that is now found in recent versions of the GNU dd implementation. Several dd defaults have been changed, usually in an effort to ease ddpt copying, or simply reading, large amounts of data. Also by default dd attempts to truncate its output file prior to the copy while ddpt defaults to overwriting the output file.

For example with dd if the 'of=' option is not given, large amounts of data (often binary) will be sent to the console (stdout) making it difficult for the inexperienced user to understand what has happened. With ddpt if the 'of=' option is not given then nothing is output (equivalent to outputting it to /dev/null) effectively making the invocation a read rather than a copy.

Basics

The basic syntax of the ddpt utility is the same as the dd command in Unix. That said, the syntax of the dd command in Unix is different from almost all other Unix commands. Those familiar with the dd command should not be too surprised by the syntax and semantics of this utility. For those unfamiliar, special care should be taken, especially with the 'of=' and 'seek=' options, both with dd and ddpt. Wikipedia has an informative page with examples on the Unix dd command.

It is not that important but the document will use the term 'operand' to refer to the <name>=<argument> construct and use the term option to refer to command line elements starting with '-' or '--'.

There are multiple definitions and implementations of dd. The simplest current definition is from POSIX.2008 (aka SUSv4). The GNU version of dd is probably the most widely used and adds the 'iflag=' and 'oflag=' options. FreeBSD has its own implementation which does not have 'iflag=' and 'oflag=' options but adds 'conv=sparse'. In this document the recent GNU implementation is used as the reference point. The fundamental options of dd are:

Basic dd operands and option	dd default	ddpt default	Brief description
bs=BS	IBS, OBS or 512	IBS, OBS or 512	Number of bytes in each input and output block. Sets IBS and OBS.
count=COUNT	blocks in IFILE	blocks in IFILE	Number of input blocks to copy.
if=IFILE	stdin	[none]	file (or device) to read from.
of=OFILE	stdout	/dev/null	file (or device) to write to.

Table 1 Fundamental dd options

When either dd or ddpt is given these options with suitable arguments, it will copy (IBS * COUNT) bytes from the beginning of IFILE to the beginning of OFILE. Note the different defaults for 'if=' and 'of=' between dd and ddpt; while defaulting to stdin and stdout may be more in keeping with a Unix filter type command, in practice the filter syntax is not used much for ddpt. The author feels that having no default for 'if=' and defaulting 'of=' to /dev/null (or the Windows equivalent: NUL) is more useful and safer.

ddpt differs from dd as follows. An IFILE of "-" is interpreted as stdin; an OFILE of "-" is interpreted as stdout while an OFILE of "." is interpreted as /dev/null. [dd interprets input and output file names of "-" literally; dd interprets an output file of "." as the current directory and will not accept it.] By default the ddpt utility does not truncate the OFILE before starting the copy (the dd command does if it is a regular file). ddpt has a 'oflag=trunc' option (or 'conv=trunc' option) that will truncate the OFILE before starting the copy. For output block devices (including those accessed via the pt interface) ddpt writes integral multiples of OBS bytes to the OFILE so it does not do partial writes, ignoring them in the case of the last copy segment. For regular output files (including fifos and stdout) ddpt can do partial writes (e.g. the last write is not a multiple of OBS bytes) to OFILE. Note that since regular OFILEs are not truncated by default the length of OFILE may end up larger than the length of IFILE.

If the 'count=' option is not given then an attempt is made to determine the remaining blocks in the file, device or partition. If the input file is stdin and no count is given then a copy will continue until an EOF is detected on the input stream (or something else goes wrong). If the 'count=' option is not given then the remaining blocks on both the input and output files are determined (if possible) and if both are found then the minimum of the two counts is used. The 'skip=' option for IFILE and the 'seek=' option for OFILE are taken into account when calculating the remaining number of blocks in a file, device or partition.
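The count rules above can be sketched as a small function. This is only an illustration, not ddpt's code: the function name is hypothetical, it assumes IBS equals OBS so both sizes are in the same units, and using the single known size when only one can be determined is an assumption.

```python
def blocks_to_copy(in_blocks, out_blocks, skip=0, seek=0, count=None):
    """Return the number of blocks to copy, or None to copy until EOF.

    in_blocks / out_blocks are the total sizes of IFILE and OFILE in blocks,
    or None when a size cannot be determined (e.g. stdin or a pipe).
    """
    if count is not None:
        # An explicit count= is used as given; no length checks are made.
        return count
    remaining = []
    if in_blocks is not None:
        remaining.append(max(in_blocks - skip, 0))   # blocks left after skip=
    if out_blocks is not None:
        remaining.append(max(out_blocks - seek, 0))  # blocks left after seek=
    # Both known: take the minimum. One known: use it (an assumption).
    return min(remaining) if remaining else None
```

For example, with a 1000 block input, an 800 block output and skip=100, the sketch settles on 800 blocks because the output is the smaller of the two remainders.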

If the 'count=' option is given then no further checks regarding the remaining length of IFILE and OFILE are done and ddpt will attempt to copy that number of blocks. The 'count=0' option is valid and all the normal preparations are made including opening files but no copy takes place. Hence the 'count=0' option can be used to check that the syntax is in order and that the files are present (see the "Verbose" section below).

Other dd options also supported by ddpt:

dd operands	Brief description
cbs=CBS	ddpt accepts but ignores this dd option ("conversion block size")
conv=CONV	see section on Conversions below
ibs=IBS	number of bytes in each block of IFILE (default: 512)
iflag=FLAGS	similar to option found in recent GNU dd versions, see below
obs=OBS	number of bytes in each block of OFILE (default: 512)
oflag=FLAGS	similar to option found in recent GNU dd versions, see below
seek=SEEK	block number (LBA) in OFILE to commence writing (default: 0). In ddpt (but not dd) it may also be a scatter gather list made up of starting_LBA,number_of_blocks pairs (default: 0,0).
skip=SKIP	block number (LBA) in IFILE to commence reading (default: 0). In ddpt (but not dd) it may also be a scatter gather list made up of starting_LBA,number_of_blocks pairs (default: 0,0).
status=STAT	accepts 'noxfer' to suppress timing and throughput information or 'none' to suppress all trailing reports (apart from errors). From version 0.96 it also accepts 'progress' for progress reports every 2 minutes. 'progress,progress' generates a progress report every minute. When used 3 times that is shortened to 30 seconds.
dd options
--help -h	print usage message then exit. '-h' option is equivalent.
--version -V	print version number and release date then exit. '-V' option is equivalent.

Table 2 Other dd options also supported by ddpt

If the 'bs=BS' option is given then both IBS and OBS are set to BS. If the 'bs=BS' option is given then the presence of either 'ibs=IBS' or 'obs=OBS' option is a syntax error. If both 'ibs=IBS' and 'obs=OBS' are given and differ then (IBS * BPT) must be divisible by OBS, without any remainder. [BPT is input "blocks per transfer" and is explained below.] For example, if a disk with 512 byte blocks (hence 'ibs=512') is being copied to another disk with 4096 byte blocks (hence 'obs=4096') then the BPT value should be 8 (or a multiple of 8). So in this case the BPT default of 128 is acceptable.
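The divisibility rule for differing IBS and OBS can be checked with a few lines of arithmetic. The function name below is hypothetical and the sketch only mirrors the rule as stated in the text.

```python
def output_blocks_per_transfer(ibs, obs, bpt):
    """Return how many output blocks one transfer holds.

    Raises ValueError when (IBS * BPT) is not an exact multiple of OBS,
    which is the condition the text requires.
    """
    transfer_bytes = ibs * bpt
    if transfer_bytes % obs != 0:
        raise ValueError(
            f"(ibs * bpt) = {transfer_bytes} is not divisible by obs = {obs}")
    return transfer_bytes // obs
```

With ibs=512 and obs=4096, bpt=8 gives exactly one output block per transfer and the default bpt of 128 gives 16, while bpt=3 is rejected.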

Modern storage is typically addressed in terms of a Logical Block Address (LBA) which starts at 0 for the first logical block and finishes with a LBA that is one less than the device size (measured in logical blocks, typically 512 or 4096 bytes long each). Prior to LBAs various schemes such as "Cylinder-head-sector" were used (see Wikipedia) that reflected the physical architecture of "hard" disks at the time.

The 'skip=SKIP' option, for SKIP greater than 0, requires IFILE to be seek-able or at least not give an error when the file pointer is moved (e.g. using the lseek() system call on /dev/zero doesn't cause an error in Unix). The 'seek=SEEK' option, for SEEK greater than 0, requires OFILE to be seek-able or at least not give an error when the file pointer is moved (e.g. using the lseek() system call on /dev/null doesn't cause an error in Unix). ddpt does not do dummy reads, as dd does, if an attempt to move a file pointer fails.

All numeric arguments can take a multiplier suffix. These multiplier suffixes are the same as those of GNU's dd (posted 2001-12-18):

Multiplier	Meaning: multiply the associated number by
x<n>	<n> [e.g. '2x512' yields 1024]
c	1
w	2
b	512
k K KiB	1024
KB	1000
m M MiB	1048576
MB	1000000
g G GiB	2**30
GB	10**9
t T TiB	2**40
TB	10**12

Table 3 Multiplier suffixes for numeric arguments

The pattern starts with "k" and proceeds to "m", "g" and "t", then to "p", "e", "z" and "y" (not shown in the above table). ddpt only implements as far as "p" (10**15 or 2**50). ddpt only allows multipliers based on "t" and "p" for COUNT, SKIP and SEEK.

ddpt allows numeric arguments to be given in hexadecimal in which case they can be prefixed by either "0x" or "0X". A numeric argument cannot both be in hex and have a suffix multiplier. Hence "0x9" is interpreted as hexadecimal 9 [not (0 * 9)==0]. This string is valid: "2x4x0xa" and yields 80 (but it isn't very clear).

Hexadecimal numbers can also be indicated by a trailing "h" or "H". The "h" suffix cannot be used together with a suffix multiplier.
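The numeric argument rules (multiplier suffixes, the 'x' multiplier and the two hexadecimal notations) can be sketched as a small recursive parser. This is an illustration under assumptions, not ddpt's parser: the function name is hypothetical, the "p" spellings are assumed to follow the pattern of the table, and accepting a suffix after an 'x' multiplier is an assumption.

```python
import re

# Suffix table from the text; the "p" entries are assumed to follow the pattern.
SUFFIXES = {
    "c": 1, "w": 2, "b": 512,
    "k": 2**10, "K": 2**10, "KiB": 2**10, "KB": 10**3,
    "m": 2**20, "M": 2**20, "MiB": 2**20, "MB": 10**6,
    "g": 2**30, "G": 2**30, "GiB": 2**30, "GB": 10**9,
    "t": 2**40, "T": 2**40, "TiB": 2**40, "TB": 10**12,
    "p": 2**50, "P": 2**50, "PiB": 2**50, "PB": 10**15,
}


def parse_num(text):
    """Parse a dd-style numeric argument into an integer."""
    # Hexadecimal with a leading 0x or 0X; no suffix multiplier is allowed.
    match = re.fullmatch(r"0[xX]([0-9a-fA-F]+)", text)
    if match:
        return int(match.group(1), 16)
    # Hexadecimal with a trailing h or H.
    match = re.fullmatch(r"([0-9a-fA-F]+)[hH]", text)
    if match:
        return int(match.group(1), 16)
    # Decimal number, optionally followed by a multiplier.
    match = re.match(r"(\d+)(.*)", text)
    if not match:
        raise ValueError(f"not a number: {text!r}")
    number, rest = int(match.group(1)), match.group(2)
    if rest == "":
        return number
    if rest[0] in "xX" and len(rest) > 1:
        # 'x<n>' multiplies by the number that follows, e.g. 2x512.
        return number * parse_num(rest[1:])
    if rest in SUFFIXES:
        return number * SUFFIXES[rest]
    raise ValueError(f"unknown multiplier in {text!r}")
```

Checking for the "0x" prefix first is what makes "0x9" read as hexadecimal 9 rather than 0 multiplied by 9.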

If a SIGUSR1 signal is sent to the process identifier (pid) of a running ddpt utility then the number of blocks copied to that point is output. The copy continues.

Unless the 'status=noxfer' option is given, the elapsed time for the copy plus the throughput measured in megabytes (10**6 bytes) per second is output when the copy is complete (or an error stops the copy). If a SIGUSR1 signal is sent to the process identifier (pid) of a running ddpt utility then the elapsed time and the throughput of the copy to that point is output and the copy continues.

ddpt extra options

The extra options of ddpt (not found in GNU's dd) are:

extra operands in ddpt	default	Brief description
bpt=BPT	varies, 128 when IBS=512	Blocks Per Transfer (BPT) is the number of input blocks per transfer (granularity of each IO) read into the copy buffer. Default varies between 8192 and 1 depending on IBS. If BPT is given as zero, it is changed to the default value. See below this table.
bpt=BPT,OBPC	OBPC=0	Output Blocks Per Check (OBPC) controls the granularity of sparse write, write sparing and trim checks. Default (0) is equivalent to OBPC=(BPT*IBS)/OBS. If the given OBPC exceeds (BPT*IBS)/OBS then it is scaled back to that value.
cdbsz=6 | 10 | 12 | 16 | 32	10 or 16	cdb size of SCSI READ and/or WRITE commands. Only applicable to pt devices. Defaults to a 10 byte cdb unless the largest address exceeds 32 bits or BPT exceeds 16 bits. In either case a 16 byte cdb is used. Two values can be given, separated by a comma; if so the first value is for IFILE, the second value is for OFILE.
cdl=CDL	0	command duration limits. Either one or two, comma separated, values where 0 means no command duration limits. Values 0 to 7 are permitted and map to 3 bit fields in the SCSI READ(16,32) and WRITE(16,32) commands. If one value is given, it applies to both IFILE and OFILE. If two values are given, the first applies to IFILE and the second applies to OFILE. Command duration limits can be accessed and changed via mode pages. See the sdparm utility.
coe=0 | 1	0	when non-zero, continue_on_error. May use iflag=coe and/or oflag=coe instead. See section on continue on error.
coe_limit=CL	0	number of consecutive "bad" block errors allowed when reading and 'coe > 0'. Default of 0 is interpreted as no limit. See section on continue on error.
ddpt=VERS		causes a syntax error if the ddpt executing the command line (or job file) version number is less than VERS. This operand was introduced in version '0.96'. If VERS starts with 'r' then the check is based on the subversion revision number; the current revision number of the ddpt package is 358.
delay=MS,W_MS	0,0	delay (sleep) after each copy segment (typically (BPT*IBS) bytes) by MS milliseconds. 0 implies no delay. Actual write operations may be delayed by W_MS milliseconds.
id_usage=LIU	0 or 2	xcopy: set list_id_usage to hold (0), discard (2) or disable (3)
intio=0 | 1	0	allow read, write and pass-through calls to be interrupted by signals. Default is 0 which means during those calls block SIGINT, SIGPIPE and SIGUSR1(SIGINFO) signals.
iseek=SKIP	0	same as skip=SKIP. From FreeBSD's dd command
ito=ITO	0	odx: inactivity timeout in seconds (0 --> TPC VPD page's default)
list_id=LID	1, 0 or 257	xcopy: list_identifier, a value from 0 to 255. odx: expanded to a 32 bit unsigned integer. oflag=wstream: LID is used as the stream identifier (16 bit value, default: 0)
of2=OFILE2	/dev/null	second output file. Cannot be pt device.
oseek=SEEK	0	same as seek=SEEK
prio=PRIO	1	xcopy: value for priority field
protect=RDP,WRP	0,0	Set the RDPROTECT field in SCSI READs, and the WRPROTECT field in SCSI WRITEs.
retries=RETR	0	number of times to retry an error on a pt device READ or WRITE command
rtf=RTF		odx: ROD Token file, see ddpt_xcopy_odx
rtype=RTYPE	0	odx: ROD type, see ddpt_xcopy_odx
to=TO	0	odx,xcopy: command timeout in seconds (0 --> 600 seconds)
verbose=VERB	0	the larger VERB is then the greater the debug output. 1 and 2 print the cdbs for setup commands; 3 and 4 print the cdbs for all commands
extra ddpt options
--dry-run -d		parse command line operands and options then prepare for the read/copy (e.g. by determining file and device sizes) but bypass the actual read/copy.
--job=JF		JF is a job file containing options. '#' treated as a comment lead-in
--odx -o		odx: request ODX operation, see ddpt_xcopy_odx
--prefetch -P		used in conjunction with --verify option. For each segment sends this sequence of SCSI commands: PRE-FETCH(OFILE, IMMED), READ(IFILE) and VERIFY(OFILE, BYTCHK=1)
--progress, -p
--quiet -q		suppress 'normal' dd like output, making ddpt more like a typical Unix utility in which "no news is good news"
--verbose -v		equivalent to 'verbose=1'. If used twice equivalent to 'verbose=2'. May be shortened to '-v' or '-vv'.
--verify -X		instead of copying IFILE to OFILE, this option causes IFILE and OFILE to be compared. The comparison continues until the count is exhausted or an inequality ("miscompare") is detected. Uses the SCSI VERIFY(BYTCHK=1) command rather than the more common READ(IFILE)+READ(OFILE)+compare approach.
--wscan -w		Windows only. Lists storage devices and associated volumes then exits. Other options ignored.
--xcopy -x		use EXTENDED_COPY rather than READ,WRITE to do the copy
ddpt command line arguments
JF		JF is a job file and its name must not start with '-' or contain a '='. JF is checked to make sure it is regular and contains ASCII characters before being parsed.

Table 4 Extra options found in ddpt

The default values for BPT are: for IBS < 8, BPT is 8192; for IBS < 64, BPT is 1024; for IBS < 1024, BPT is 128; for IBS < 8192, BPT is 16; for IBS < 32768, BPT is 4; else BPT is 1.
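The BPT defaults listed above form a simple threshold ladder. A minimal sketch follows; the function name is hypothetical and only the thresholds come from the text.

```python
def default_bpt(ibs):
    """Return the default blocks-per-transfer for a given input block size."""
    # (upper bound on IBS, default BPT) pairs taken from the text above.
    ladder = [(8, 8192), (64, 1024), (1024, 128), (8192, 16), (32768, 4)]
    for limit, bpt in ladder:
        if ibs < limit:
            return bpt
    return 1  # IBS of 32768 bytes or more
```

For the common block sizes this gives a default transfer of 64 KiB in both cases: 128 blocks of 512 bytes, or 16 blocks of 4096 bytes.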

If OFILE2 is given then it is written to prior to the write to OFILE including processing such as sparse writing.

Flags

The FLAGS argument of 'iflag=' and 'oflag=' is a comma separated list of items chosen from one or more entries in this table:

FLAG	filetype: pt, blk or reg	iflag or oflag	comments
00		iflag	replaces if=IFILE with as many bytes of 0x0 as are required. Same as giving if=/dev/zero in Unix.
append	reg	oflag	use O_APPEND open flag. Conflicts with 'seek=SEEK' when "SEEK > 0". Pointless on block device, may cause open error. odx: open RTF regular file with O_APPEND
atomic	pt	oflag	use WRITE ATOMIC(16) command in place of the usual WRITE command
block	pt	both	open of pt files (devices) typically defaults to non-blocking. This flag will make the open()s blocking
cat	blk, pt	both	xcopy: set cat flag in segment descriptor header
coe	all	iflag, both for pt	See section on continue on error.
dc	blk, pt	both	xcopy: set dc flag in segment descriptor header
direct	blk, reg	both	use O_DIRECT open flag. Bypass block layer's buffering.
dpo	pt	both	"disable page out" set for READ and/or WRITE SCSI commands
errblk	pt	iflag	writes LBAs of bad blocks (medium errors) to errblk.txt file. One LBA per line, in hex, preceded by 0x.
excl	all	both	Use O_EXCL open flag
fdatasync	blk,reg	oflag	flush OFILE's data to storage at end of copy. Ignored if oflag=direct also given.
ff		iflag	replaces if=IFILE and supplies as many bytes of 0xff as are required
flock	all	both	use advisory exclusive lock
force	pt	both	override objections and warnings from sanity checking (e.g. discrepancy between IBS or OBS and the block size in the SCSI READ CAPACITY command response)
fsync	blk,reg	oflag	flush OFILE's data and metadata to storage at end of copy. Ignored if oflag=direct also given.
fua	pt	both	"force unit access" set for READ and/or WRITE SCSI commands
fua_nv	pt	both	"force unit access non-volatile cache" set for READ and/or WRITE SCSI commands
ignoreew	tape	oflag	ignore early warning (of end of tape).
nocache	blk, reg	both	Use posix_fadvise(POSIX_FADV_DONTNEED) to suggest minimal use of file buffers (kernel cache) associated with files being copied.
nocreate	all	oflag	The default action if OFILE does not exist is to create a regular file of that name. This can give unwanted results, for example 'of=/dev/sg7' if there is no device of that name will create a regular file; and that regular file will hide device /dev/sg7 if it does get connected. [The solution is: 'rm /dev/sg7'.] With 'oflag=nocreate' an error will occur if OFILE does not already exist, and no copy (or read) takes place.
no_del_tkn	pt	oflag	odx: see ddpt_xcopy_odx
nofm	tape	oflag	Suppress writing the filemark which is normally written by the st tape driver on closing the tape file
nopad	tape	oflag	when the block to be written to a tape drive contains less than OBS bytes, then this option causes the partial block to be written as is. The default action for a tape in this case is to pad the block.
norcap	pt	both	do not perform SCSI READ CAPACITY command
nowrite	all	oflag	bypass writes to OFILE. Other commands (e.g. related to trim) are sent to OFILE. The "records out" count is not incremented. See section on trim and unmap.
null	all	both	this flag is just a place holder
odx	pt	both	odx: request ODX operation, see ddpt_xcopy_odx
pad	all	oflag	when the block to be written (typically the last block) contains less than OBS bytes, then this option causes the block to be padded with zeros. The default for tapes is to pad; the default for other file types is nopad.
prealloc or pre-alloc	reg	oflag	use fallocate() to allocate space for OFILE prior to any data being written. This reduces fragmentation of OFILE.
prefer-rcs	pt	oflag	odx: prefer RECEIVE COPY STATUS command to default RECEIVE ROD TOKEN INFORMATION command
pt	blk	both	access block device via SCSI pass-through mechanism. Has no effect on pt device. For NVMe disks a SNTL is used to translate SCSI commands to the corresponding NVMe (or NVM) commands when this flag is used [Linux only currently].
rarc	pt	iflag	set field of that name in SCSI READ commands
resume	reg	oflag	if copy interrupted add 'resume' to oflag to restart copy
rtf_len	pt	both	odx: see ddpt_xcopy_odx
self	pt	both	specify self trim when used together with trim. Can appear as iflag or oflag but applies to OFILE which needs to be a pt device
sparing	all	oflag	don't write output buffers if reading the OFILE indicates the data compares equal. See section on write sparing.
sparse	all	oflag	don't write output buffers that are full of zeros. The last segment of a regular OFILE is written except when the sparse argument is given twice. See section on sparse writes.
ssync	pt	oflag	send SCSI SYNCHRONIZE CACHE command to OFILE after copy
strunc	reg	oflag	variant of oflag=sparse in which ftruncate() system call is used to extend OFILE if necessary
sync	all	both	use O_SYNC open flag, probably ignored on pt devices
trim	pt	oflag [iflag]	similar functionality to sparse. Sends TRIM (UNMAP or WRITE SAME) command when zeros found. See section on trim and unmap .
trunc	reg	oflag	truncate the OFILE prior to starting the copy. If SEEK is not given or 0, truncate to zero length; else truncate to the length implied by SEEK. The default action of ddpt is to not truncate the OFILE (the opposite of what the dd command does).
unmap	pt	oflag	See trim
verify [,bytchk]	pt	oflag	for pass-through output use the WRITE AND VERIFY command rather than WRITE. If ",bytchk" option given then set field of that name in command.
wstream	pt	oflag	for pass-through output use the WRITE STREAM command rather than WRITE
xcopy	blk, pt	both	if with iflag then EXTENDED COPY sent to IFILE. If with oflag then EXTENDED COPY sent to OFILE

Table 5 Arguments to ddpt's iflag and oflag options

Recent versions of GNU's dd command have these flags with similar semantics as ddpt: 'append', 'direct' and 'sync'.

Conversions

The CONV argument of 'conv=' is a comma separated list of items chosen from one or more entries in this table:

CONV	filetype: pt, blk or reg	comments
fdatasync	blk,reg	see fdatasync flag
fsync	blk,reg	see fsync flag
nocreate	all	equivalent to oflag=nocreate
no_del_tkn	pt	equivalent to oflag=no_del_tkn
noerror	all	IO error does not stop copy. dd's 'conv=noerror,sync' maps to ddpt's 'iflag=coe'. See the coe flag
notrunc	reg	do not truncate OFILE before copy starts (default action)
null	all	this conversion is just a place holder
prefer_rcs	pt	equivalent to oflag=prefer_rcs
resume	reg	see the resume flag
rtf_len	pt	equivalent to "oflag=rtf_len"
sparing	all	see the sparing flag
sparse	all	see the sparse flag
sync	all	this conversion is accepted and ignored.
trunc	reg	see the trunc flag

Table 6 Arguments to ddpt's conv option

The dd command in Unix has been around for a long time. In the early days the 'conv=' option allowed things like ASCII to EBCDIC conversions to take place as part of the copy process. ddpt does not implement these "classical" conversions. More recently, conversions have been added by some dd implementations (e.g. FreeBSD's dd supports 'conv=sparse') that resemble some ddpt features. So in some cases, conversions are accepted and mapped to various flag arguments.

Scatter Gather Lists

The standard skip=<starting_LBA> and equivalent seek= operand of the dd command have been generalized into scatter gather lists in version 0.96 of ddpt. For skip= a gather list is a sequence of [starting_LBA,number_of_blocks] pairs that will be "gathered up" on the remote storage device and sent back to the host (running the ddpt command) as a linear sequence. For seek= a scatter list is a sequence of [LBA,number_of_blocks] pairs that after a linear sequence of bytes sent from the host will be "scattered" on the remote storage device. Since these two operations are closely related and reciprocal, the collective term of "scatter gather list" (sgl) is used for both.

A little sleight of hand is used here to generalise skip=<starting_LBA> to skip=<sgl> as used by ddpt. A scatter gather list element essentially replaces both the skip=<starting_LBA> and count=<number_of_blocks> operands in the standard dd utility. So the problem is how to accept a standard dd invocation using both skip=<starting_LBA> and count=<number_of_blocks> operands when given to the ddpt utility so that the end result is the same. The solution chosen was to treat skip=<starting_LBA> as a special case and expand it to the sgl element [starting_LBA, 0]. Further, when the last sgl element has "0" as its number_of_blocks then that is interpreted as "to the end of the transfer" where that count is deduced some other way. Apart from this special case (i.e. the sleight of hand) when skip= and seek= operands are given to ddpt, they must have an even number of elements following this pattern: LBA0,NUM0,LBA1,NUM1,LBA2,NUM2 ... etc. Here LBA is a shortening of starting logical block address and NUM is a shortening of number of blocks (from and including the starting LBA).
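The expansion of a plain skip= or seek= value into an sgl can be sketched as follows. This is a simplified illustration, not ddpt's parser: the function name is hypothetical and only decimal numbers are handled (no hex, multiplier suffixes or @file forms).

```python
def parse_sgl(arg):
    """Turn a skip=/seek= argument into a list of (LBA, NUM) pairs."""
    numbers = [int(field) for field in arg.split(",")]
    if len(numbers) == 1:
        # Special case: a lone starting LBA expands to [starting_LBA, 0],
        # where a NUM of 0 means "to the end of the transfer".
        return [(numbers[0], 0)]
    if len(numbers) % 2 != 0:
        raise ValueError("an sgl needs an even number of elements")
    # Pair up LBA0,NUM0,LBA1,NUM1,...
    return list(zip(numbers[0::2], numbers[1::2]))
```

So the dd style 'skip=100' becomes the single element (100, 0), while 'skip=0,8,16,8' describes two 8 block runs starting at LBA 0 and LBA 16.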

If a large sgl is to be input, placing it in a file may be more convenient, then skip=@sgl_filename or seek=H@sgl_in_hex_filename can be given. Following the lead of this utility and others that it shares a library with from sg3_utils, all numbers given are in decimal, unless otherwise indicated. Hex numbers can be indicated by a leading "0x" (the C/C++ language convention) or with a trailing "h" (as used in the t10.org generated standards). Scatter gather lists may have geometric properties (e.g. converting CHS addressed storage to the corresponding LBA addressed storage) which may lead to them being generated by a program and a hex representation may be more convenient. sgl files that are implicitly hexadecimal can be loaded with the seek=H@<filename> syntax. As an extra check, such a sgl file containing implicitly hex numbers must contain the word "hex" (or "HEX") before any sgl elements. This is a sanity check.

sgl elements that have 0 as their number_of_blocks are somewhat curious and termed as degenerate elements in ddpt. No data is copied to (or read from) the storage device based on a degenerate sgl element. In certain contexts they can be viewed as metadata. When categorizing sgls the treatment of degenerate elements is problematic. For example, should a degenerate sgl element affect the calculation of a sgl's lowest and highest LBAs? Also, should a degenerate sgl element affect the classification of a sgl as monotonic? Ascending monotonic means the current LBA is greater than or equal to the sum of the previous element's LBA and its NUM, for all sgl elements other than the first. A related definition can be used for descending monotonic scatter gather lists.
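The ascending monotonic definition translates directly into a pairwise check. In this sketch the function name is hypothetical, and degenerate elements are treated like any other element (with NUM of 0), which is only one of the possible treatments the text calls problematic.

```python
def is_ascending_monotonic(sgl):
    """Check an sgl given as a list of (LBA, NUM) pairs.

    True when every element after the first starts at or beyond the end
    of the previous element (previous LBA + previous NUM).
    """
    for (prev_lba, prev_num), (lba, _num) in zip(sgl, sgl[1:]):
        if lba < prev_lba + prev_num:
            return False
    return True
```

An sgl with overlapping elements, such as (0, 8) followed by (4, 8), fails the check because LBA 4 is less than 0 + 8.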

Environment variables

The following environment variables modify the behaviour of ddpt when they are defined in the shell that ddpt is invoked in:

DDPT_DEF_BS
ODX_RTF_LEN
XCOPY_TO_DST
XCOPY_TO_SRC

ddpt defaults its BS value to 512 bytes which might become tiresome when working with a newer 4096 block size storage environment. The solution (in a bash shell) is to do this:

export DDPT_DEF_BS=4096

The other 3 environment variables are xcopy/odx specific and are explained in the ddpt_xcopy_odx page.

Job file

Job files are modelled on a similar facility in the fio utility. Both ddpt and fio have a lot of command line options that can become burdensome to re-enter if the utility is being executed multiple times. So a job file is simply a file that contains options. Options can be placed on separate lines and anything in a job file starting with "#" to the end of that line is ignored; so "#" can be used as a lead-in to comments. Blank lines in a job file are ignored.
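The job file layout described above (options, "#" comments, blank lines) can be illustrated with a short reader. This is a sketch rather than ddpt's own parser: the function name is hypothetical, and splitting several options on one line at whitespace is an assumption.

```python
def job_file_options(text):
    """Return the list of options found in the text of a job file."""
    options = []
    for line in text.splitlines():
        # Everything from '#' to the end of the line is a comment.
        line = line.split("#", 1)[0]
        # Blank (or comment-only) lines contribute nothing.
        options.extend(line.split())
    return options


example = """\
# zero the start of a partition
if=/dev/zero bs=512   # input and block size

count=1
"""
print(job_file_options(example))
```

The example job file text is made up for illustration; its operands are all ones described earlier in this page.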

In ddpt, a job file can be specified as an argument to --job=JF or, more dangerously, directly on the command line. The second variant is "more dangerous" because it needs to distinguish itself from dd style options that contain an equal sign (e.g. iflag=coe) and common syntax errors. For example:

ddpt if=/dev/zero iflag-coe bs=512 /dev/sde3 seek=20m count=1

will attempt to parse both iflag-coe as a job file (probably because the user meant to type '=' rather than '-') and /dev/sde3 as a job file (where the user probably meant of=/dev/sde3). Various safety checks on potential job files will catch these cases (but not all situations). Most likely a file called iflag-coe doesn't exist, and if /dev/sde3 does exist, it is not a "regular" file. A further check is done looking for non-ASCII characters which should catch pure data or executable files that ddpt is trying to interpret as a job file.

ddpt parses job files (there may be several) when it sees them in a left to right scan of the command line options. Depending on the option, earlier defined options may override, clash with, or accumulate with the same option given in a job file (or later on the command line). For example bpt= options override one another so the last one encountered "wins"; on the other hand if=IFILE and of=OFILE options clash, only one of each is permitted. And options like --verbose (or -v) accumulate. Job files themselves are parsed line by line, from the "top" to the end of the file.
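The three ways repeated options combine (override, clash, accumulate) can be sketched with a left to right scan over a flat option list. The function name is hypothetical and only the options named in the paragraph above are handled; everything else is ignored by this sketch.

```python
def combine_options(tokens):
    """Scan options left to right and combine repeats as the text describes."""
    settings = {"verbose": 0}
    for token in tokens:
        if token in ("-v", "--verbose"):
            settings["verbose"] += 1          # accumulates
        elif token.startswith("bpt="):
            settings["bpt"] = token[4:]       # last one encountered wins
        elif token.startswith(("if=", "of=")):
            key = token[:2]
            if key in settings:
                # Only one if= and one of= is permitted.
                raise ValueError(f"{key}= given more than once")
            settings[key] = token[3:]
    return settings
```

Given 'bpt=64' followed later by 'bpt=32' the sketch keeps 32, two verbose options give a level of 2, and a second 'if=' raises an error.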

Job files can call other job files including themselves. The depth of the call chain is tracked and when it reaches 5, the parsing stops with an error. This should catch infinite recursion when a job file invokes itself.
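The call chain limit can be sketched with a depth counter passed down the recursion. Several things here are assumptions made for illustration: the function name, the in-memory dictionary standing in for files on disk, nesting only via '--job=NAME', and the exact way the depth is counted (the outermost job file is taken as depth 1).

```python
MAX_JOB_DEPTH = 5  # from the text: parsing stops when the chain reaches 5


def expand_job(name, job_files, depth=1):
    """Return the flat option list for job file `name`.

    `job_files` is a hypothetical map of job file name -> file contents.
    """
    if depth >= MAX_JOB_DEPTH:
        raise ValueError(f"job file call chain too deep at {name!r}")
    options = []
    for line in job_files[name].splitlines():
        for token in line.split("#", 1)[0].split():
            if token.startswith("--job="):
                # A nested job file is parsed where it is encountered.
                nested = token[len("--job="):]
                options.extend(expand_job(nested, job_files, depth + 1))
            else:
                options.append(token)
    return options
```

A job file that names itself keeps increasing the depth on each call, so the limit is reached and the expansion ends with an error instead of looping forever.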

File types

Broadly there are three file types: regular files, block devices and block devices accessed via a pass-through interface. In earlier sections these are abbreviated to "reg", "blk" and "pt" respectively. Additionally there are various special files that may also be useful: /dev/null, /dev/zero and /dev/random . Then there are console input, output and error output known in Unix as stdin, stdout and stderr respectively. ddpt (and dd) use stderr for a summary of blocks moved and for warning and error messages. Both stdin and stdout are available for command line piping.

The ddpt utility examines the files it is given and treats them differently depending on their file type. Depending on iflag=FLAGS and oflag=FLAGS settings: O_DIRECT, O_SYNC, O_APPEND, O_EXCL and O_TRUNC flags may be added to the relevant open system call. In Unix see 'man 2 open' or 'man -s 2 open' for more information on the open system call.

File type	open IFILE	open OFILE	IO method	Notes
regular	O_RDONLY	O_WRONLY | O_CREAT	Unix read() write()	N.B. A regular output file is overwritten (not truncated).
stdin or stdout (or pipe)	[do nothing]	[do nothing]	Unix read() write()	hence open() flags have no effect (e.g. 'oflag=direct' is ignored)
/dev/null or . (period)	O_RDONLY	[do nothing]	Unix read() if input	if output file then nothing is written
block device	O_RDONLY	O_WRONLY | O_CREAT	Unix read() write()	Windows uses a device specific IO method
pt device	O_RDWR or O_RDONLY	O_RDWR	SCSI commands	Opens input O_RDONLY if O_RDWR fails

Table 6 Treatment of various file types by ddpt

Some of the above combinations are not sensible (e.g. 'oflag=append' on a block device). When either 'iflag=direct' or 'oflag=direct' is given (hence opening the corresponding file with O_DIRECT) the internal copy buffer used is aligned to the page size. For example the page size in the Linux i386 architecture is 4 kilobytes.

Depending on the platform when a file is known to be associated with the pass-through interface (e.g. in Linux /dev/sg* and /dev/bsg/* devices) the "pt" flag is assumed. This implies that in Linux if 'if=/dev/sg2' is specified then there is no need to add 'iflag=pt'. On the other hand if the file appears to be a block device (e.g. in Linux /dev/sdc) then the normal read()/write() system calls will be used unless 'iflag=pt' (or 'oflag=pt') is given.

With block and pt devices the operating system may impose an upper limit on the size of each IO operation. The size that ddpt will attempt to use is IBS*BPT bytes. If this limit is exceeded the operating system may well respond with an EIO (input/output) error. In such cases try reducing the BPT value.
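As a worked example of the IBS*BPT arithmetic, the sketch below computes the size of each IO and the largest BPT that stays under a given limit. The function names and the 32768 byte limit are invented for illustration; the real limit depends on the operating system and device.

```python
def transfer_bytes(ibs, bpt):
    """Size in bytes of each IO the copy will attempt (IBS*BPT)."""
    return ibs * bpt


def largest_bpt_within(ibs, max_io_bytes):
    """Largest BPT for which IBS*BPT does not exceed max_io_bytes."""
    return max(1, max_io_bytes // ibs)


# With 512 byte blocks and 128 blocks per transfer, each IO is 64 KiB.
default_io = transfer_bytes(512, 128)

# Under a hypothetical 32768 byte per-IO limit, BPT would need to drop to 64.
reduced_bpt = largest_bpt_within(512, 32768)
```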

If a partition of a block device is accessed (e.g. in Linux /dev/sda2) and the "pt" flag is not given then logical block address 0 for the purposes of ddpt (and its skip and seek options) is the beginning of that partition while the calculated count (e.g. when a 'count=' option is not given) is the extent of that partition. However if a partition of a block device is accessed (e.g. in Linux /dev/sda2) when the "pt" flag is active then the partition is ignored and the underlying device (i.e. /dev/sda) is accessed. This means logical block address 0 for the purposes of ddpt (and its skip and seek options) is the beginning of the device (i.e. not the partition) while the calculated count (e.g. when a 'count=' option is not given) is the extent of the whole device.

From ddpt version 0.92 some further checks are made when a block device is accessed with the "pt" flag. If there is a discrepancy between the block device size and the SCSI READ CAPACITY command applied via the pass-through then the copy is aborted unless the "force" flag is given. For example: one would expect the block file size of 'if=/dev/sda' and the READ CAPACITY size of 'if=/dev/sda iflag=pt' to be the same; however the block file size of 'if=/dev/sda2' should be less than the READ CAPACITY size of 'if=/dev/sda2 iflag=pt' causing the copy to be aborted with a warning.

Retries

Often retries are of little use, especially on medium errors, since the device has probably already done multiple retries before the medium error is reported. However a transport error (e.g. causing a CRC error in returned data) is not necessarily seen by the device and a retry may quickly solve the problem. In SAS a Transport Layer Retries (TLR) state machine is optional and requires both the initiator and target to implement the capability. Most first generation SAS disks do not implement TLR. So transport errors in the form of "aborted commands" can be reported due to corruption (e.g. caused by marginal cables) or congestion.

When the retries=RETR option is given and RETR is greater than 0 then most errors on a READ or WRITE SCSI command are retried up to RETR times. Device not ready errors are not retried and "unit attention" conditions are automatically retried (without looking at or decrementing RETR). Once the number of retries is exhausted on the same operation without success then ddpt will refer to the 'coe' option as to what to do next. Each new operation (a READ, a WRITE, or an access to a different logical block address) has its own retry count initialized to RETR.
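The retry rules above can be sketched as a small loop. This is an illustration only: the exception classes and the function name are invented here as stand-ins for SCSI sense conditions, and ddpt's real implementation is not reproduced.

```python
class TransportError(Exception):
    """Stand-in for an error worth retrying (e.g. an aborted command)."""


class UnitAttention(Exception):
    """Stand-in for a "unit attention" condition."""


class NotReady(Exception):
    """Stand-in for a device not ready error."""


def run_with_retries(operation, retr):
    """Run one READ or WRITE style operation with its own retry count."""
    retries_left = retr  # each new operation starts again from RETR
    while True:
        try:
            return operation()
        except UnitAttention:
            continue  # retried without looking at or decrementing the count
        except NotReady:
            raise  # not retried
        except TransportError:
            if retries_left == 0:
                raise  # exhausted: the caller would now consult 'coe'
            retries_left -= 1


# An operation that fails twice and then succeeds needs retries=2.
attempts = []


def flaky_read():
    attempts.append(1)
    if len(attempts) < 3:
        raise TransportError()
    return "data"


result = run_with_retries(flaky_read, 2)
```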

Continue on error (coe)

The ddpt utility may be used as a copy "of last resort" on failing media. ddpt will attempt to continue after errors termed as "unrecoverable" or "medium" are encountered. Other types of errors (e.g. lack of permissions on a device or an interrupt signal) will terminate the copy at the point where they are encountered. The general strategy is to replace unrecoverable blocks on the input file with zeros; and to step over (i.e. essentially ignore) medium errors on the output file. In both cases the alignment of the input and output files is maintained.

Continue on error can be invoked several ways: 'iflag=coe', 'oflag=coe', 'coe=1' or 'conv=noerror,sync'. The first variant requests coe on the input file, the second variant requests coe on the output file; the third variant requests coe on both input and output files although it may be ignored on the output file. The last variant is for compatibility with the dd command and requests coe on both input and output files.

The 'coe_limit=CL' option is meant to stop ddpt continuing ad nauseam if unrecoverable errors are being detected continuously on input. The input media may be blank (e.g. unrecorded) or beyond its logical block address limit. The default value of CL is zero and this is interpreted as not having a limit. When coe is active additional counters are maintained for unrecovered read and write errors. If they are non zero at the end of the copy then they are printed out. If either appears then that indicates an imperfect copy.

The actions of coe are slightly different depending on the type of file. So there are two subsections below, the first for coe used with block devices and regular files; the second for coe used with pt (i.e. pass-through) devices. If the reader intends to use coe on pt devices then it is recommended to read both subsections.

Another dd variant called dd_rescue (see www.garloff.de/kurt/linux/ddrescue/ ) has similar "continue on error" facilities.

coe for block devices and regular files

Continue on error (coe) can be requested for input block devices or regular files. It is ignored for output block devices and regular files meaning that any error on such an output file will terminate the copy.

The standard dd command also has a "continue on error" facility. These two invocations are roughly the same:
ddpt if=/dev/sdc iflag=coe of=sdc.img bs=512
dd if=/dev/sdc of=sdc.img bs=512 conv=noerror,sync

Without the 'count=COUNT' option the whole of /dev/sdc will be copied to sdc.img . The 'noerror' argument to 'conv=' tells dd to continue on error and the 'sync' tells it to supply zeros for blocks that cannot be read. If the 'sync' is not given then the image file will be shorter when errors are detected. As a convenience ddpt will accept 'conv=noerror,sync' to mean 'iflag=coe'. So the following invocation (followed by its output) is equivalent to the two above:
ddpt if=/dev/sdc of=sdc.img bs=512 conv=noerror,sync
16376+8 records in
16384+0 records out
8 unrecovered read errors
lowest unrecovered read lba=4656, highest unrecovered lba=4663
time to transfer data: 5.822152 secs at 1.44 MB/sec

So 8 unrecovered read errors were detected, starting at LBA 4656 until LBA 4663 inclusive. Like dd, unrecovered read errors are counted as partial reads hence the "16376+8 records in" line. There are a few problems with this: there was actually only one unrecoverable error at LBA 4660 on the device; the other (related) problem is the slow overall read speed. In this case the block layer is accessing the device with a block size of 4 KBytes (while the device has 512 byte blocks). That block size mismatch leads to 7 good 512 byte blocks being "lost" (i.e. LBAs 4656 to 4659 and 4661 to 4663 are valid but will be all zeros in sdc.img). Both dd and ddpt support "iflag=direct" which bypasses the block layer's buffering. So using "iflag=direct" is recommended; the improvement can be seen in the following example:
ddpt if=/dev/sdc iflag=coe,direct of=sdc.img bs=512
16383+1 records in
16384+0 records out
1 unrecovered read error
lowest unrecovered read lba=4660, highest unrecovered lba=4660
time to transfer data: 1.026536 secs at 8.17 MB/sec

Notice that only the unrecoverable LBA (i.e. 4660) has been found this time. Also the overall read speed is faster. Unrecoverable errors on spinning media (as distinct from Solid State Drives (SSDs)) often occur within a range of LBAs, possibly due to physical damage on the media. To help in identifying bad regions the lowest and highest unrecovered LBA are shown at the end of the copy.
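The range of zero-filled blocks in the first example (without 'iflag=direct') follows from simple arithmetic. The sketch below is illustrative only: the function name is invented, and it assumes the block layer reads in 4096 byte units aligned from LBA 0, which is consistent with the output shown above.

```python
def lbas_lost_to_one_bad_block(bad_lba, device_block=512, layer_block=4096):
    """LBAs zero-filled when one bad block spoils a whole block layer unit."""
    per_unit = layer_block // device_block      # 8 device blocks per 4 KB unit
    first = (bad_lba // per_unit) * per_unit    # round down to the unit start
    return list(range(first, first + per_unit))


lost = lbas_lost_to_one_bad_block(4660)
good_but_lost = [lba for lba in lost if lba != 4660]
```

Here 'lost' spans LBAs 4656 to 4663, matching the "lowest" and "highest" unrecovered LBAs reported, and 'good_but_lost' holds the 7 readable blocks that end up as zeros.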

coe for pt devices

Continue on error (coe) can be requested for input and output pt devices. Write errors on pt devices are reported and ignored and the ddpt utility continues. [In the case where media errors are causing write errors the user should check the setting of the AWRE bit in the SCSI "read write error recovery" mode page (see SBC-3 at http://www.t10.org).]

When a SCSI READ command detects an unrecoverable read error it responds with a sense key of MEDIUM ERROR or HARDWARE ERROR. Additionally it responds with the logical block address of the first (lowest) block that it failed to read in the current READ command. Any valid blocks prior to the "bad" block may or may not have been transferred. If coe is not given then the ddpt utility will simply terminate at this point (with a reasonable amount of debug information sent to stderr) and good blocks prior to the bad block may not be copied (depending on the setting of BPT). If coe is active then the first thing ddpt will try to do is a truncated read up to, but not including, the bad block.

There still remain blocks after the "bad" block that need to be fetched. Further bad blocks may be detected and if so the algorithm in the last paragraph is repeated. The result of this process is an imperfect copy with blocks that were read properly placed in the correct relative position in OFILE.
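The read strategy in the two paragraphs above can be sketched against a simulated device. Everything here is hypothetical scaffolding for illustration (the FakeDisk class, the MediumError exception and the function name are invented); it shows the shape of the algorithm, not ddpt's actual code.

```python
BLOCK = 512


class MediumError(Exception):
    """Carries the LBA of the first (lowest) block that failed to read."""

    def __init__(self, lba):
        super().__init__(f"medium error at lba {lba}")
        self.lba = lba


class FakeDisk:
    """Simulated device: reads fail at the lowest bad LBA in the range."""

    def __init__(self, bad_lbas):
        self.bad = set(bad_lbas)

    def read(self, lba, count):
        for cur in range(lba, lba + count):
            if cur in self.bad:
                raise MediumError(cur)
        # Fill each good block with a non-zero byte derived from its LBA.
        return b"".join(bytes([cur % 251 + 1]) * BLOCK
                        for cur in range(lba, lba + count))


def read_with_coe(disk, lba, count):
    """Read count blocks, substituting zeros for unrecovered blocks."""
    out = bytearray()
    unrecovered = []
    end = lba + count
    while lba < end:
        try:
            out += disk.read(lba, end - lba)
            break
        except MediumError as err:
            good = err.lba - lba
            if good > 0:
                out += disk.read(lba, good)  # truncated read up to bad block
            out += bytes(BLOCK)              # zeros keep the alignment
            unrecovered.append(err.lba)
            lba = err.lba + 1                # carry on after the bad block
    return bytes(out), unrecovered


data, unrecovered = read_with_coe(FakeDisk(bad_lbas={5}), 0, 16)
```

Because a block of zeros is substituted for each bad block, every good block keeps its correct relative position in the output.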

While accessing a block device via a pt interface is often a "win" for error processing it is not always so. There are lots of "SCSI" devices that have substandard error reporting (e.g. many USB mass storage devices). MEDIUM ERRORs are sometimes reported without the LBA of the lowest block in error or it is misreported. The coe logic can compensate for some known bad error reporting cases (e.g. many CD/DVD/BD players cause problems due to conflicts between standards, i.e. SPC and MMC). Error reporting irregularities may lead to the early termination of ddpt even when coe is given. In the author's experience, good error reporting separates the professional products from the rest.

When trying to recover data from a badly flawed device, it is important not to blindly re-read bad blocks as this will absorb a large amount of time. Most modern devices will not report a bad block until they have retried internally many times and using different techniques. Many OS block layers think they are being helpful in adding their own layer of retries. This often leads to very slow copy/recovery performance.

The 'retries=RETR' option is also available to pt devices. Depending on the settings of the SCSI Read-Write error recovery mode page, doing retries on medium errors may or may not be worthwhile. Often a disk has retried reading a block multiple times before reporting a medium error. However there are other sorts of errors (e.g. transport errors) that are often worth retrying. The 'retries=RETR' option logic is applied (and if necessary exhausted) before the coe logic starts its work.

Here is an example following on from those in the previous subsection:
ddpt if=/dev/sdc iflag=coe,pt of=sdc.img bs=512
>> unrecovered read error at blk=4660, substitute zeros
16383+1 records in
16384+0 records out
1 unrecovered read error
lowest unrecovered read lba=4660, highest unrecovered lba=4660
time to transfer data: 0.520079 secs at 16.13 MB/sec

An additional feature of coe on pt devices is that unrecovered blocks are reported as they are detected.

Recovered errors

Often errors are recovered using ECC data or by the device retrying (usually re-reading) the media. Typically at the first sign of trouble, recoverable errors lead to the block in question being reassigned to another location on the media (automatically when the AWRE and ARRE bits are set in the "read write error recovery" mode page). The user may be blissfully unaware that the media may be reaching the end of its useful life. Error counters are maintained in the "Read error counter" and "Write error counter" log pages which can be viewed with smartctl (from smartmontools) and sg_logs (from the sg3_utils package). Any block that is automatically or manually re-assigned adds a new entry to the "grown" defect list which can be viewed with 'sginfo -G' or 'sg_reassign -g' (both found in the sg3_utils package).

A SCSI storage device can be instructed to report RECOVERED ERRORs by setting the PER bit in the "read write error recovery" mode page. Most often this bit is clear. When ddpt detects RECOVERED ERRORs it reports them, counts them and continues the copy. Only the LBA of the last recovered error in a READ or WRITE SCSI command is reported so there could be more than one recovered error per SCSI command. The bpt=1 option could be chosen to limit every SCSI command to a single block transfer (but that would slow things down a fair amount). If the count of recovered errors is greater than zero at the end of the copy then this count is output as well.

There can be other reasons for a sense key of RECOVERED ERROR not directly related to data being currently read or written. SMART alerts (called in SCSI documents "Informational Exceptions") can be conveyed via a RECOVERED ERROR sense key (see the MRIE field in the Informational Exceptions mode page). Such alerts have additional sense codes like "Failure prediction threshold exceeded" and those that contain "impending failure".

Sparse writes

Unix file systems usually allow the file pointer to be moved (e.g. with the lseek() system call) around arbitrarily. Moving the file pointer around can result in "holes" or overwriting of existing data in a file. An output file containing "holes" is what the term sparse writes refers to. When such a sparsely written file is read then each "hole" is interpreted as a sequence of zero bytes. In the case of a regular file, one way of detecting its "sparseness" is to compare the output of the du and ls -l Unix commands.

The concept of sparse writes also is useful for block devices accessed via standard Unix commands. And it is useful for direct access devices accessed via a SCSI command set (such as SBC for disks or MMC for cd/dvds). The underlying assumption here is that the device already has zero bytes in the blocks that are not explicitly written to. A SCSI disk just after the FORMAT command has been successfully performed typically contains zeros in all its blocks. Another way to "zero" a SCSI disk is with a WRITE SAME command (the ATA8 SCT feature set also contains a WRITE SAME command).

When the ddpt utility is given the 'oflag=sparse' or 'oflag=strunc' flag, it will check each copy buffer fetched from IFILE for zeros. The size of each copy buffer, perhaps apart from the last buffer, is IBS*BPT bytes (see the discussion of OBPC below). If the buffer is full of zeros then the corresponding write to OFILE is bypassed.

Bypassing writes of blocks full of zeros can save a lot of IO. However with regular files, bypassed writes at the end of the copy can lead to an OFILE which is shorter than it would have been without sparse writes. This can lead to integrity checking programs like md5sum and sha1sum generating misleading values.

This utility has two ways of handling this file length problem: writing the last block (even if it is full of zeros) or using the ftruncate system call. A third approach is to ignore the problem (i.e. leaving OFILE shorter). The ftruncate approach is used when 'oflag=strunc' is given, while the last block is written when 'oflag=sparse'. To ignore the file length issue use 'oflag=sparse,sparse'. Note that if OFILE's length is already correct or longer than required, no action is taken.
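The zero check and the three ways of handling the file length can be sketched as below. This is a simplified illustration rather than ddpt's implementation: the function names and the 'final' parameter are invented, with "write" standing for 'oflag=sparse', "truncate" for 'oflag=strunc' and "ignore" for 'oflag=sparse,sparse'.

```python
import io
import os
import tempfile


def is_all_zeros(buf):
    return buf.count(0) == len(buf)


def sparse_copy(src, dst, buf_size, final="write"):
    """Copy src to dst, bypassing writes of all-zero buffers.

    Returns the number of writes that were bypassed.
    """
    offset = 0
    bypassed = 0
    last = b""
    last_bypassed = False
    while True:
        buf = src.read(buf_size)
        if not buf:
            break
        last, last_bypassed = buf, is_all_zeros(buf)
        if last_bypassed:
            bypassed += 1
        else:
            dst.seek(offset)
            dst.write(buf)
        offset += len(buf)
    current_len = dst.seek(0, os.SEEK_END)
    # No action if the output is already long enough.
    if last_bypassed and current_len < offset:
        if final == "write":
            dst.seek(offset - len(last))
            dst.write(last)          # write the last buffer even though zero
            bypassed -= 1
        elif final == "truncate":
            dst.truncate(offset)     # extend the length without writing data
    return bypassed


# One non-zero buffer followed by two all-zero buffers, 8 bytes each.
data = b"\x01" * 8 + bytes(16)
results = {}
for mode in ("write", "truncate", "ignore"):
    with tempfile.TemporaryFile() as dst:
        count = sparse_copy(io.BytesIO(data), dst, 8, final=mode)
        results[mode] = (count, dst.seek(0, os.SEEK_END))
```

In this run only the "ignore" mode leaves the output shorter than the input (8 bytes rather than 24), which is the case that would upset a checksum comparison.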

The support for sparse writing of regular files may depend on the OS, the file system and the settings of OFILE. POSIX makes few guarantees when the ftruncate system call is used to extend a file's length, as may occur when 'oflag=strunc'. Further, primitive file systems like VFAT may not accept sparse writes or simulate the effect by writing blocks of zeros. The latter approach will defeat any sparse writing performance gain.

A sparse file may also be created by ddpt by using the 'seek=SEEK' option. Here is an example:

$ ddpt if=/dev/zero of=t seek=1m bs=1024 count=1
1+0 records in
1+0 records out
time to transfer data: 0.000156 secs at 6.56 MB/sec
$ ls -lh t
-rw-rw-r-- 1 fred fred 1.1G 2007-06-28 22:15 t
$ du -h t
12K t

The above shows that even though the file system knows the sparse file is (logically) 1.1 GB long, it only consumes 12 KB of space within the file system. In the above case, ddpt is producing the same result as the standard dd command. Programs that calculate checksums such as md5sum and sha1sum should give the same result when applied to either a sparse file or the corresponding non-sparse file.

For speed, it is best to have BPT at its default value or larger. However doing sparse write checks for zeros in units of IBS*BPT bytes may be too large, missing the chance to bypass many writes. The OBPC option allows the granularity of the check buffer to be reduced, the minimum being one OBS [the output (logical) block size]. Values of OBPC that imply a check buffer larger than IBS*BPT bytes are rounded back to OBPC=(BPT*IBS)/OBS . The default value of OBPC (0) also uses a check buffer of IBS*BPT bytes.
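The rounding rule for OBPC can be expressed as a short calculation. The function name below is invented for illustration, and the sketch only mirrors the rules stated in the paragraph above.

```python
def sparse_check_bytes(ibs, bpt, obs, obpc=0):
    """Size in bytes of the buffer checked for zeros, given OBPC."""
    copy_buffer = ibs * bpt
    if obpc == 0 or obpc * obs > copy_buffer:
        # Default, or too large: fall back to OBPC=(BPT*IBS)/OBS.
        obpc = copy_buffer // obs
    return obpc * obs


whole_buffer = sparse_check_bytes(512, 128, 512)            # default OBPC
finest = sparse_check_bytes(512, 128, 512, obpc=1)          # one OBS block
rounded_back = sparse_check_bytes(512, 128, 512, obpc=1000) # too large
```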

Write sparing

Write sparing is most useful when a significant proportion of the data to be written is expected to be identical to the data already there, and where writing is slower than reading, or where write endurance is limited (e.g. SSD, USB flash drive or memory card). For example, suppose you have a bootable USB flash drive which you regularly back up to an image file on your SSD. Using write sparing will greatly reduce the amount of data that needs to be written, since most data will match that in the already-existing previous image file.

With write sparing, after reading the IFILE, the corresponding segment in the OFILE is read into a second buffer and the two buffers are compared. If unequal, the write of the original buffer to OFILE takes place as normal. If equal then the write to OFILE is bypassed. The OFILE should exist and be readable and seek-able (hence stdout is not appropriate). OFILE's length may be shorter than that of IFILE.
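The compare-then-write step described above can be sketched with in-memory files. The function name and the 4 byte segment size are chosen purely for illustration; this is not ddpt's code.

```python
import io


def sparing_copy(src, dst, seg_size):
    """Copy src to dst, bypassing writes of segments that already match.

    dst must be readable and seekable. Returns (written, bypassed).
    """
    offset = 0
    written = 0
    bypassed = 0
    while True:
        new = src.read(seg_size)
        if not new:
            break
        dst.seek(offset)
        old = dst.read(len(new))   # short or empty if dst is shorter than src
        if old == new:
            bypassed += 1
        else:
            dst.seek(offset)
            dst.write(new)
            written += 1
        offset += len(new)
    return written, bypassed


previous_image = io.BytesIO(b"AAAABBBBCCCC")
current_data = b"AAAAXXXXCCCCDDDD"
written, bypassed = sparing_copy(io.BytesIO(current_data), previous_image, 4)
```

Only the changed segment and the segment beyond the old end of the output are written; the two unchanged segments cost a read each but no write.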

It seems unlikely that it would be useful to have both sparse writes and write sparing active on the same OFILE. If they are both given (i.e. 'oflag=sparing,sparse') then sparse writes are checked first and if zeros are found, the check for write sparing is bypassed on that segment.

Trim and unmap

This is a storage feature often associated with Solid State Disks (SSDs) or disk arrays with "thin provisioning". In the ATA command set (ACS-2) the relevant command is DATA SET MANAGEMENT with the TRIM bit set. In the SCSI command set (SBC-4) it is either the UNMAP or WRITE SAME commands. Note that there is no TRIM command; however, this feature has been christened "trim" by the technical press.

Trim is a way of telling a storage device that blocks are no longer needed. Keeping the pool of unwritten blocks large is important for the write performance of SSDs and the thrifty use of real storage in thin provisioned arrays. Currently file systems in recent OSes may issue trims associated with file deletes. The trim option in ddpt may be useful when a partition or a whole SSD is to be "deleted". Note that ddpt is bypassing file systems in that it only offers trim on pass-through (pt) devices.

This utility issues SCSI commands to pt devices and for "trim" currently issues a SCSI WRITE SAME(16) command with the UNMAP bit set. If the pt device is an SSD with an ATA interface then recent versions of Linux will translate the SCSI WRITE SAME command to the ATA DATA SET MANAGEMENT command with the TRIM bit set. The maximum size of each "trim" command sent is the size of the copy buffer (i.e. IBS * BPT bytes). That maximum can be reduced with the OBPC argument of the 'bpt=' option.

Verbose

In the Unix style, ddpt doesn't output anything (to stderr) during large IO transfers. To get a progress report the SIGUSR1 signal can be sent to the ddpt process. In the Unix dd command style, ddpt outputs two lines on completion that show the number of full and partial records in (on the first line) and out (on the second line).

ddpt has a 'verbose=' option whose default value is zero. When set to these values 'verbose=' has the following actions:

1. show categorization and INQUIRY data (where applicable) for the input and output files. For files, other than streams, the file/device size (and device block size) are output.
2. same output as 1 plus data for Unix and SCSI commands (cdbs) that are not repeated (i.e. other than Unix read/write and SCSI READ/WRITE commands). Increased error reporting for all SCSI commands.
3. same output as 2 plus data for Unix and SCSI commands (cdbs) that are repeated. For a large copy this will be a lot of output.
4. maximum amount of debug output. For a large copy this will be a lot of output.

All verbose output is sent to stderr so that ddpt with "of=-" (copy output to stdout) is not corrupted.

Following is an example of using verbose=1 to find information about /dev/sda . If no copy is required then setting count=0 will see to that. Since /dev/sda is a block device then it would normally be accessed via Unix system commands. The verbose=1 output is relatively short for non-pt devices. The second invocation (here on /dev/sdb) is with 'iflag=pt' and more is output. That includes INQUIRY standard response data (e.g. the "Linux scsi_debug ..." line). See the SBC-2 drafts at www.t10.org for more information.

$ ddpt if=/dev/sda bs=512 verbose=1 count=0
>> Input file type: block device
open input, flags=0x0
>> Output file type: null device
/dev/sda [blk]: blocks=625142448 [0x2542eab0], block_size=512, 320 GB approx
skip=0 (blocks on input), seek=0 (blocks on output)
initial count=0 (blocks of input), blocks_per_transfer=128
0+0 records in
0+0 records out
time to read data: 0.000028 secs

# ddpt if=/dev/sdb iflag=pt bs=512 verbose=1 count=0
>> Input file type: pass-through [pt] device block device
/dev/sdb: Linux scsi_debug 0004 [pdt=0]
>> Output file type: null device
/dev/sdb [pt]: blocks=16384 [0x4000], block_size=512, 8 MiB approx
skip=0 (blocks on input), seek=0 (blocks on output)
initial count=0 (blocks of input), blocks_per_transfer=128
0+0 records in
0+0 records out
time to read data: 0.000031 secs

As an experimental feature setting 'verbose=-1' will map stderr to /dev/null so that no debug messages and copy summary will appear.

Disk partitions

It may be useful to copy a partition on a disk. To do this the partition table may need to be read, preferably in units that are useful for ddpt. Following is an example of the GNU parted utility where "unit s" means in units of the logical block size (e.g. 512 bytes):

# parted /dev/sda unit s print
Model: ATA FUJITSU MHY2160B (scsi)
Disk /dev/sda: 312581808s
Sector size (logical/physical): 512B/512B
Partition Table: msdos

Number Start       End         Size        Type      File system     Flags
1      2048s       14280703s   14278656s   primary   ntfs
2      14280704s   156299263s 142018560s primary   ntfs            boot
3      156310560s 312575759s 156265200s extended
5      156310623s 310700879s 154390257s logical   ext3
6      310700943s 312575759s 1874817s    logical   linux-swap(v1)

Assume we want to copy the whole "2" partition to a file; that could be done in a few ways:

ddpt if=/dev/sda2 of=/tmp/a.bin bs=512
ddpt if=/dev/sda skip=14280704 of=/tmp/b.bin bs=512 count=142018560
ddpt if=/dev/sda3 iflag=pt skip=14280704 of=/tmp/c.bin bs=512 count=142018560

So if /dev/sda2 is named then 'skip=' and 'count=' are not needed unless 'iflag=pt' is given. Once 'iflag=pt' is given any variants of /dev/sda (e.g. /dev/sda1 /dev/sda2 /dev/sda3 etc) map back to /dev/sda because SCSI commands are not partition aware. From ddpt version 0.92 the third case (i.e. 'if=/dev/sda3 iflag=pt') would be aborted with a warning; to override that 'iflag=pt,force' would be required.
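The 'skip=' and 'count=' values used in the invocations above come straight from the parted listing. The small sketch below (function name invented for illustration) shows the arithmetic for partition 2, using the Start and End sector values from that listing.

```python
def partition_skip_count(start_sector, end_sector):
    """Return (skip, count) in logical blocks for a partition.

    parted reports End as the last sector of the partition (inclusive).
    """
    return start_sector, end_sector - start_sector + 1


skip, count = partition_skip_count(14280704, 156299263)
command = f"ddpt if=/dev/sda skip={skip} of=/tmp/b.bin bs=512 count={count}"
```

The resulting count equals the Size column that parted prints for the partition, which is a useful cross-check before starting a long copy.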

As a double check the ddpt 'verbose=1 count=0' test will show the size of what is being considered:

# ddpt if=/dev/sda2 bs=512 verbose=1 count=0
>> Input file type: block device
        open input, flags=0x0
>> Output file type: null device
/dev/sda2 [blk]: blocks=142018560 [0x8770800], block_size=512, 72 GB approx
...

# ddpt if=/dev/sda2 iflag=pt,force bs=512 verbose=1 count=0
>> Input file type: pass-through [pt] block device
    /dev/sda2: ATA       FUJITSU MHY2160B 0000 [pdt=0]
>> Output file type: null device
/dev/sda2 [pt]: blocks=312581808 [0x12a19eb0], block_size=512, 160 GB approx
...

Partition table information can also be obtained with the fdisk utility. The output is a little more messy than parted for the same 160 GB disk. Note the partition size is in the "Blocks" column and seems to be in 1024 byte units:

# fdisk -ul /dev/sda

Disk /dev/sda: 160.0 GB, 160041885696 bytes
255 heads, 63 sectors/track, 19457 cylinders, total 312581808 sectors
Units = sectors of 1 * 512 = 512 bytes
Disk identifier: 0x79cbdc8f

   Device Boot      Start         End      Blocks   Id System
/dev/sda1            2048    14280703     7139328   27 Unknown
Partition 1 does not end on cylinder boundary.
/dev/sda2   *    14280704   156299263    71009280    7 HPFS/NTFS
Partition 2 does not end on cylinder boundary.
/dev/sda3       156310560   312575759    78132600    5 Extended
Partition 3 does not end on cylinder boundary.
/dev/sda5       156310623   310700879    77195128+ 83 Linux
/dev/sda6       310700943   312575759      937408+ 82 Linux swap / Solaris
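The observation that the "Blocks" column is in 1024 byte units can be checked against the Start and End columns. The sketch below uses an invented function name; treating a trailing "+" as marking a partition with an odd number of 512 byte sectors is an interpretation that agrees with the figures in this listing.

```python
def fdisk_blocks(start, end, sector_bytes=512):
    """Return (blocks of 1024 bytes, True if a partial block is left over)."""
    sectors = end - start + 1
    blocks, remainder = divmod(sectors * sector_bytes, 1024)
    return blocks, remainder != 0


sda1 = fdisk_blocks(2048, 14280703)
sda2 = fdisk_blocks(14280704, 156299263)
sda5 = fdisk_blocks(156310623, 310700879)
```

Here sda1 and sda2 divide evenly, while sda5 leaves half a block over, corresponding to the "77195128+" entry.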

When copying a partition to a file, a lot of storage may be required. Using 'oflag=sparse' may save space. Copying a small amount first and checking the OFILE with a utility like 'hexdump -C' may confirm that the full copy will be worthwhile.

xcopy and odx

XCOPY is a common shortening for the EXTENDED COPY facility introduced in SPC-2 (ANSI INCITS 351-2001). EXTENDED COPY was enhanced in SPC-3 and now as SPC-4 approaches standardization the facility has become a lot larger. The original facility introduced in SPC-2 is now called EXTENDED COPY(LID1) where "LID1" means it has a List IDentifier length of 1 byte. There is now a "LID4" variant (with 4 byte list identifiers) with added flexibility and complexity. A subset of EXTENDED COPY(LID4) that supports token based disk to disk copies was proposed with the name "xcopy version 2, lite" and accepted. It is based on two new SBC-3 commands: POPULATE TOKEN and WRITE USING TOKEN. Microsoft has integrated this subset into its servers and given it the name: Offloaded Data Xfer (ODX).

Individual SCSI and ATA disks do not typically support xcopy; disk arrays and iSCSI servers do. Support for xcopy(LID1) has been added to the Linux target subsystem. There is a 3PC field in a standard SCSI INQUIRY response that when set indicates a "logical unit" (LU: an abstraction of a disk) supports at least some xcopy functionality. If the 3PC field is not set in both the source LU and the destination LU then an xcopy operation is extremely likely to fail.

Support for disk to disk copies using EXTENDED COPY(LID1) was added to the sg3_utils package with the sg_xcopy utility (and its companion: sg_copy_results). That sg_xcopy functionality was ported into ddpt in version 0.93 and is referred to in these pages as "xcopy". In version 0.94 of ddpt ODX support has been added and is referred to in these pages as "odx". A new companion utility called ddptctl adds some extra odx functionality such as listing and decoding ROD Tokens and the ability to abort a copy in progress.

See ddpt_xcopy_odx for more information.

Tape

On Linux systems, ddpt can also work with tape drives via the "st" SCSI tape driver. On Debian-based distributions, it is suggested that you install the mt-st package, which provides a more fully-featured version of the "mt" tape control program (see 'man mt' for more details).

Tape drives can operate in fixed- or variable-length block modes. In variable-block mode, each write to the tape writes a single block of that size. In fixed-block mode, each write to the tape must be a multiple of the previously-selected block size. The block size/mode can be set with the mt command prior to invoking ddpt. For example:
# mt -f /dev/nst0 setblk 0
sets variable-block mode, and
# mt -f /dev/nst0 setblk 32768
sets fixed-block mode with block size 32768 bytes.

Note that some tape drives support only fixed-block mode, and possibly even only one block size. (For example, QIC-150 tapes use a fixed block size of 512 bytes.) There may also be restrictions on the block size, e.g. it may have to be even.

When using ddpt to write to tape, if the final read from the input is less than OBS, it is padded to OBS bytes before writing to tape to ensure that all blocks of the tape file are the same length. Having a shorter final block would fail if the drive is in fixed-block mode, and could create interchange problems. It is common to expect all blocks in a file on tape to be the same length. However, to tell ddpt to not pad the final block, use oflag=nopad .

The st tape driver normally writes a filemark when the file (/dev/nst0 etc.) is closed. If you prefer to not have the filemark written, use oflag=nofm . One use case for that might be if using ddpt several times in succession to append more data to the same file on tape. In that case you will probably want to ensure that a filemark gets written at the end. So either omit oflag=nofm on the last ddpt invocation, or manually write a filemark using mt after ddpt exits:
# mt -f /dev/nst0 weof 1

For reading from an unknown tape where you don't know which block size(s) were used, you can read in variable-block mode specifying a large IBS. The st driver returns a smaller amount of data if the size of the block read is smaller. Thus a command like:
# ddpt if=/dev/nst0 of=output.bin bs=262144
should read the file from tape regardless of the block size used (assuming no blocks are larger than 256 KB). You can use the verbose option to have ddpt tell you what the actual block sizes are.

Tape users may be interested in this virtual tape library project: mhvtl .

ddptctl and ddpt_sgl

These are helper utilities for ddpt. Both have standard Unix option syntax. So both have long options starting with "--" and short options starting with a single "-".

Examples

Most of these examples use Linux device names. See the device naming page for appropriate device names in other supported operating systems.

To start with, read 1024 blocks, each of 512 bytes, from a block device. Notice there is no 'of=<OFILE>' argument so the output goes to /dev/null (i.e. it gets thrown away). [Beware: the dd command defaults to sending the output to stdout (often making a mess on the screen)].

# ddpt if=/dev/sda bs=512 count=1k
Output file not specified so no copy, just reading input
1024+0 records in
0+0 records out
time to read data: 0.013480 secs at 38.89 MB/sec

Now to access the same device (assuming /dev/sda and /dev/sg0 refer to that device) via the pass-through interface use either:

# ddpt if=/dev/sda iflag=pt bs=512 count=1k
Output file not specified so no copy, just reading input
1024+0 records in
0+0 records out
time to read data: 0.005916 secs at 88.62 MB/sec

# ddpt if=/dev/sg0 bs=512 count=1k
Output file not specified so no copy, just reading input
1024+0 records in
0+0 records out
time to read data: 0.005945 secs at 88.19 MB/sec

The first form needs the 'iflag=pt' option because given a device name that can be accessed via either a block device interface or a pass-through interface, ddpt will default to using the block device interface. In the second form the /dev/sg0 device only supports the pass-through interface.

To copy from a block device to a file:

# ddpt if=/dev/sdb of=t.img bs=512 count=64
64+0 records in
64+0 records out
time to transfer data: 0.179983 secs at 0.18 MB/sec

This copies 32 KB from the beginning of /dev/sdb to the file t.img .
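
The 32 KB figure is simply count multiplied by bs. A minimal sketch of that arithmetic in shell follows; the variable names are illustrative only and are not ddpt options:

```shell
# Illustrative variables mirroring bs=512 count=64 in the command above
bs=512
count=64
bytes=$((bs * count))      # total bytes copied
kib=$((bytes / 1024))      # same figure in units of 1024 bytes
echo "$bytes bytes = $kib KB"
```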

Now a bit more ambitious: to copy from one block device to another. Beware that writing to a block device is an irreversible operation so take care. To have a closer look at what might happen use the combination of 'count=0 verbose=2' to check things:

# ddpt if=/dev/sda of=/dev/sdb oflag=pt bs=512 count=0 verbose=2
>> Input file type: block device
        open input, flags=0x0
>> Output file type: block device
open /dev/sdb with flags=0x802
    inquiry cdb: 12 00 00 00 24 00
    /dev/sdb: Linux     scsi_debug        0004 [pdt=0]
/dev/sda [blk]: blocks=625142448 [0x2542eab0], block_size=512, 320 GB approx
    read capacity (10) cdb: 25 00 00 00 00 00 00 00 00 00
/dev/sdb [pt]: blocks=2048 [0x800], block_size=4096, 8 MiB approx
>> warning: /dev/sdb block size confusion: obs=512, device claims=4096
skip=0 (blocks on input), seek=0 (blocks on output)
ibs=512 bytes, obs=512 bytes, OBPC=0
initial count=0 (blocks of input), blocks_per_transfer=128
0+0 records in
0+0 records out
time to transfer data: 0.000034 secs

To add some variation, the 'pt' option was selected on the output block device. It is not necessary to understand all the details. The main points are that the 'count=0' makes sure no data is actually written so no damage is done. The 'count=0' argument causes the size of the disks (in blocks) and the logical block size of the disks to be examined. This highlights a problem noted on the line starting with ">> warning:" and that is that the logical block size of /dev/sdb is 4096 bytes. Since 'obs=OBS' has not been specified then OBS is assumed to be the same as BS which is 512 bytes. So the invocation should have been:

# ddpt if=/dev/sda ibs=512 of=/dev/sdb oflag=pt obs=4096 count=0 verbose=2

Grouping the arguments helps as does using 'ibs=512' rather than 'bs=512' in this case. Note that the count argument, if given, is in units of IBS. If we do the copy then the size of the smaller disk dictates the number of blocks moved (if the count argument is not given or is too large):

# ddpt if=/dev/sdb ibs=512 of=/dev/sdc oflag=pt obs=4096
3477504+0 records in
434688+0 records out
time to transfer data: 108.931869 secs at 16.34 MB/sec

In this case /dev/sdb was the smaller (about 1.8 GB). There are eight times as many "records in" as "records out", reflecting the different block sizes.
In the next case the size of /dev/sdc becomes the limiting factor due to the seek=2097000 option:

# ddpt if=/dev/sdb ibs=512 of=/dev/sdc oflag=pt obs=4096 seek=2097000
1216+0 records in
152+0 records out
time to transfer data: 0.063875 secs at 9.75 MB/sec
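
The record counts in the two copies above can be cross-checked with shell arithmetic. This is only a sketch: the variable names are illustrative, and the size of /dev/sdc is inferred from the seek value and the records written (on the assumption that the copy stopped at the end of /dev/sdc), not reported directly by ddpt here.

```shell
ibs=512
obs=4096
ratio=$((obs / ibs))                 # input blocks per output block

# First copy: 3477504 records in, 434688 records out
in_bytes=$((3477504 * ibs))
out_bytes=$((434688 * obs))          # expected to equal in_bytes

# Second copy: seek=2097000 and 152 records out
sdc_blocks=$((2097000 + 152))        # inferred /dev/sdc size in OBS blocks
sdc_bytes=$((sdc_blocks * obs))

echo "ratio=$ratio in_bytes=$in_bytes out_bytes=$out_bytes"
echo "sdc_blocks=$sdc_blocks sdc_bytes=$sdc_bytes"
```

Both byte totals come to 1780482048 (about 1.8 GB), and the inferred /dev/sdc size is 2097152 blocks of 4096 bytes, which is 8 GiB.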

The sparse writing flag can be used to count the number of blocks containing zeros. Here 5 GB of a lightly used partition is checked, starting 2 GB after the start of the partition:

# ddpt if=/dev/sda7 skip=4m bs=512 oflag=sparse count=10m
Output file not specified so no copy, just reading input
10485760+0 records in
0+0 records out
8895616 bypassed records out
time to read data: 138.777383 secs at 38.69 MB/sec

Actually a copy buffer (128 blocks in this case) is being checked for zeros, so not all of the blocks containing only zeros are counted. A more accurate count can be made by setting OBPC to 1. However the execution time may suffer slightly:

# ddpt if=/dev/sda7 skip=4m bs=512 oflag=sparse count=10m bpt=128,1
Output file not specified so no copy, just reading input
10485760+0 records in
0+0 records out
8908836 bypassed records out
time to read data: 136.614629 secs at 39.30 MB/sec
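
To put the two runs side by side, the sketch below converts the skip and count values to bytes and works out the proportion of bypassed blocks in each run. The variable names are illustrative, and the 'm' multiplier of 1048576 is taken from count=10m yielding 10485760 records in the output above.

```shell
bs=512
m=$((1024 * 1024))                         # multiplier implied by count=10m
skip_bytes=$((4 * m * bs))                 # skip=4m
count_blocks=$((10 * m))                   # count=10m
count_bytes=$((count_blocks * bs))

# Bypassed blocks in tenths of a percent (integer arithmetic, rounded down)
coarse=$((8895616 * 1000 / count_blocks))  # first run: whole copy buffer checked
fine=$((8908836 * 1000 / count_blocks))    # second run: bpt=128,1
extra=$((8908836 - 8895616))               # extra zero blocks found with OBPC=1

echo "skip=$skip_bytes bytes, checked=$count_bytes bytes"
echo "bypassed: $((coarse / 10)).$((coarse % 10))% vs $((fine / 10)).$((fine % 10))%, $extra extra blocks"
```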

A job file called read_check.jf might contain this:

# Read given IFILE as pt device and check, continue on error
iflag=pt,coe    # access as pt device (may fail depending on OS)
bs=512
verbose=1    # could be increased for more noise

and be used like this:

# ddpt --job=read_check.jf if=/dev/sdc
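
A job file like this can be written once and reused across several devices from a script. The sketch below creates the file with a here-document and only prints the ddpt commands (via echo) rather than running them; the second device name, /dev/sdd, is a hypothetical addition for illustration.

```shell
# Write the job file, one argument per line with trailing comments
cat > read_check.jf <<'EOF'
# Read given IFILE as pt device and check, continue on error
iflag=pt,coe    # access as pt device (may fail depending on OS)
bs=512
verbose=1       # could be increased for more noise
EOF

# /dev/sdd is hypothetical; echo prints each command instead of running it
for dev in /dev/sdc /dev/sdd; do
    echo "ddpt --job=read_check.jf if=$dev"
done
```

Remove the echo (and the quotes around the command) to actually run the checks.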

Note that /dev/sdc is used as a pass-through device which will yield better (or at least lower level) error reporting if a problem is found.

There are also some examples in the ddpt man page and the doc/ddpt_examples.txt file in the distribution tarball.

Downloads

The tarball contains the source and can be built with a './configure ; make ; make install' sequence. In some cases executing the './autogen.sh' script prior to './configure' may be required.

Table 7. ddpt tarballs and packages

ddpt version	release date	tarball	Windows exes [32, 64 bit]	rpm binary	debian package
0.91	20100922	ddpt-0.91.tgz, ddpt-0.91.tar.bz2	ddpt.exe	ddpt-0.91-1.i386.rpm	ddpt_0.91-0.1_i386.deb
0.92	20110217	ddpt-0.92.tgz, ddpt-0.92.tar.gz, ddpt-0.92.tar.bz2	ddpt.exe	ddpt-0.92-1.i386.rpm	ddpt_0.92-0.1_i386.deb
0.93	20131113	ddpt-0.93.tgz, ddpt-0.93.tar.xz	-	ddpt-0.93-1.i386.rpm, ddpt-0.93-1.x86_64.rpm	ddpt_0.93-0.1_i386.deb, ddpt_0.93-0.1_amd64.deb
0.94	20140407	ddpt-0.94.tgz, ddpt-0.94.tar.xz	ddpt-0.94exe.zip	ddpt-0.94-1.i386.rpm, ddpt-0.94-1.x86_64.rpm	ddpt_0.94-0.1_i386.deb, ddpt_0.94-0.1_amd64.deb
0.95	20141227	ddpt-0.95.tgz, ddpt-0.95.tar.xz	ddpt.exe, ddpt_64.exe, ddptctl.exe, ddptctl_64.exe	ddpt-0.95-1.i386.rpm, ddpt-0.95-1.x86_64.rpm	ddpt_0.95-0.1_i386.deb, ddpt_0.95-0.1_amd64.deb
0.96	20200303	ddpt-0.96.tgz, ddpt-0.96.tar.xz	-	ddpt-0.96-1.x86_64.rpm	ddpt_0.96-0.1_amd64.deb
0.97	20210421	ddpt-0.97.tgz, ddpt-0.97.tar.xz	-	ddpt-0.97-1.x86_64.rpm	ddpt_0.97-0.1_i386.deb, ddpt_0.97-0.1_amd64.deb

The Windows executables were made in a MinGW environment. Here are the most recent ChangeLog and Unix style manpages for ddpt, ddptctl and ddpt_sgl in html.

This utility shares code with the sg3_utils package, specifically a library called libsgutils. If available during the build, the libsgutils library will be used at runtime. If the library is not detected during the build, the required code is built into the executable (making it slightly larger). For ease of use, the binary packages in table 7 do not depend on libsgutils, but distributions (e.g. Red Hat and Debian) prefer to factor out common code.

References

The Open Group Base Specifications Issue 7 (also known as SUSv4) have a useful definition of the basic Unix dd command: see http://www.opengroup.org/onlinepubs/9699919799 and select "Shell & Utilities" (on the left) then select "Utilities" (on the lower left), and finally select "dd" (from the list in the lower left).

When a pass-through interface is used, the ddpt utility issues SCSI commands that are defined in SPC-4 (primary commands), SBC-3 (commands for direct access devices (e.g. disks)) and MMC-5 (commands for CD/DVD devices). These SCSI command sets can be found at www.t10.org . When the storage device is an ATA disk (e.g. a SATA disk) a SCSI to ATA Translation layer (SATL compliant with SAT or SAT-2) is assumed.


Last updated: 22nd April 2021