OVERVIEW-FFP.txt
----------------

Note:	this document is written in ordinary ASCII text with tabs set
		to a width of 4.


Overview of Full Feature Phase (FFP)
------------------------------------

There are 5 steps involved in using the iSCSI initiator (the first 4 must be
done as root):

1.	Load the "iscsi_initiator.o" module.
2.	Use "../common/iscsi_manage" to set up parameters for negotiation.
3.	Use "iscsi_config up" to trigger the login process and the discovery
	process that follows it.
4.	Use "mount /mnt/scsi" to mount the scsi target device as a disk.
5.	Use ordinary operations to create/delete/read/write files on the disk.


5.0	Once the scsi target device has been mounted, the file system it contains
	can be used by ordinary users in the way they would use any other file
	system on any other disk.  At the command level, this means that the
	user can use any of the normal file manipulation commands: cp, mv, rm,
	tar, mkdir, rmdir, cd, pwd, etc., on files and directories stored on the
	scsi target disk.   Furthermore, any programs can access these files
	using the normal open, close, read, write, etc. system calls.

	For example, if the file system is mounted using the command
		"mount /mnt/scsi"
	and I have a directory on this file system called "rdr", then I can do
	the following types of operations:
		cd /mnt/scsi/rdr		# my local directory is now on that file system
		pwd						# shows me that
		ls						# lists all files in that directory
		vi xyz.c				# edits the file xyz.c in that directory
		...
		make xyz				# builds an executable in that directory
		xyz						# runs the executable xyz from that directory
		...


	During these operations, the Linux SCSI mid-level delivers only READ (10)
	and WRITE (10) commands to the iscsi_initiator module, so these are the
	only commands we will look at here.


5.1	SCSI READ

	A SCSI READ operation is used to transfer data from the target to the
	initiator.  This means it is a "read" operation from the point of view
	of the initiator, who "reads" data from the target.  From the target's
	point of view, a READ operation causes it to "read" data from the target's
	disk, but to "send" (i.e., "write") data to the initiator.


5.1.1	Overview

	To simplify this discussion, we will initially assume that "phase collapse"
	is not being used.  Later we will explain what happens when
	"phase collapse" is used.  We will also ignore for the moment the effects
	of the MaxBurstSize key on data being read.  This will also be explained
	later.

	A "task" is a SCSI operation.  In principle, one task can require
	several (linked) SCSI commands, and each SCSI command can involve the
	transfer of several iSCSI PDUs.  However, the Linux SCSI system does not
	seem to use multi-command tasks, so we only see one SCSI command per task.
	(In any case, SCSI allows only one command per task to be outstanding at
	any time.)  This one command, however, will usually require the transfer
	of several iSCSI PDUs -- for a READ command, these PDUs will be:

	1.	one "SCSI Command" PDU sent by the initiator to deliver the READ
		command information to the target,

	2.	some number of "DataIn" PDUs sent back by the target to deliver the
		data to the initiator,

	3.	one final "SCSI Response" PDU sent back by the target to deliver the
		final status to the initiator.

	To account for these various levels, the iSCSI protocol requires that
	several "identifiers" and "counters" be maintained by the initiator
	and the target:

	1.	The Initiator Task Tag (ITT).  This is a value determined by the
		initiator that uniquely identifies each outstanding iSCSI task for the
		life of that task.  In the case of a READ operation, the same ITT value
		is carried in all the PDUs mentioned above (the "SCSI Command" PDU, all
		the "DataIn" PDUs, and the final "SCSI Response" PDU), because together
		they are all part of the same task.  Our iscsi_initiator implements the
		ITT as a per-session counter, called "init_task_tag", in the session
		data structure.  It is incremented by 1 on each call to the
		"iscsi_initiator_queuecommand" function.  Note, however, that the iSCSI
		protocol does not require this to be a sequential counter!  That is
		just the simplest way to implement it so that no two tasks will have
		the same ITT.  The target should not check this value for
		sequentiality, and should just copy it opaquely into all PDUs it sends
		back that are related to this task in order to identify their
		relationship to the initiator.

	2.	The Command Sequence Number (CmdSN).  This is a sequence number, and
		must be implemented by the initiator as a session-wide counter that is
		incremented by 1 for each command sent in that session, regardless of
		which task the command is associated with.  This value is contained
		only in PDUs sent by the initiator.  The target should check this value
		for sequentiality, and should use it to detect duplicate and/or missing
		PDUs.

	3.	The Expected Command Sequence Number (ExpCmdSN).  This is a value that
		is maintained by the target in order to check the value of the CmdSN
		field in incoming PDUs from the initiator.  If the incoming CmdSN value
		is less than the target's ExpCmdSN, the incoming PDU is a duplicate of
		a PDU already received by the target and should be ignored.  If the
		incoming CmdSN value is greater than the target's ExpCmdSN, the
		incoming PDU is out of order, indicating that the PDU containing
		the proper (i.e., expected) CmdSN may have been lost.  If the incoming
		CmdSN value is equal to the target's ExpCmdSN (the normal case), then
		the incoming PDU is in the proper order, and the target increments its
		ExpCmdSN by 1.  Whenever the target sends a PDU of any kind back to the
		initiator, it will include in that PDU a copy of its current ExpCmdSN
		value.  This "acknowledges" to the initiator that the target has
		received all command PDUs with CmdSN values less than this ExpCmdSN.

	4.	The Data Sequence Number (DataSN).  This is a sequence number, and must
		be implemented by the target as a command-dependent counter that is
		incremented by 1 for each DataIn PDU it sends back for this command.
		The first DataIn PDU for each command contains a DataSN of 0.  The
		initiator should check this value for sequentiality, and should use it
		to check for duplicate and/or missing DataIn PDUs from the target.

	5.	The Status Sequence Number (StatSN).  This is a sequence number, and
		must be implemented by the target as a connection-wide counter that is
		incremented by 1 for each response PDU it sends back on this
		connection.  It is only sent by the target.  The initiator should check
		this value for sequentiality, and should use it to check for duplicate
		and/or missing response PDUs from the target.  Note that after a "SCSI
		Command" PDU for a READ operation, the one or more DataIn PDUs sent by
		the target do not contain a StatSN (or rather, the StatSN field in the
		PDU is "reserved" and therefore contains a value of 0).  Only the one
		"SCSI Response" PDU sent after all the DataIn PDUs contains a valid
		StatSN (and advances the StatSN by 1).

	6.	The Expected Status Sequence Number (ExpStatSN).  This is a value that
		is maintained by the initiator in order to check the value of the
		StatSN field in incoming PDUs from the target.  If the incoming StatSN
		value is less than the initiator's ExpStatSN, the incoming PDU is a
		duplicate of a PDU already received by the initiator and should be
		ignored.  If the incoming StatSN value is greater than the initiator's
		ExpStatSN, the incoming PDU is out of order, indicating that the PDU
		containing the proper (i.e., expected) StatSN may have been lost.  If
		the incoming StatSN value is equal to the initiator's ExpStatSN (the
		normal case), then the incoming PDU is in the proper order, and the
		initiator increments its ExpStatSN by 1.  Whenever the initiator sends
		a command PDU of any kind to the target, it will include in that PDU a
		copy of its current ExpStatSN value.  This "acknowledges" to the target
		that the initiator has received all response PDUs with StatSN values
		less than this ExpStatSN.
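
	The checks described for ExpCmdSN and ExpStatSN follow the same pattern,
	so they can be sketched with one helper.  This is only an illustration:
	the names "classify_sn" and "ExpectedCounter" are hypothetical and do not
	appear in the iscsi_initiator source, and the wrap-around of the 32-bit
	counters that a real implementation must handle is ignored here.

```python
def classify_sn(incoming: int, expected: int) -> str:
    """Classify an incoming sequence number against the expected value.

    Plain integer comparison, as in the description above; counter
    wrap-around is deliberately ignored in this sketch.
    """
    if incoming < expected:
        return "duplicate"      # already received: ignore the PDU
    if incoming > expected:
        return "out-of-order"   # the expected PDU may have been lost
    return "in-order"           # normal case: the caller advances its counter


class ExpectedCounter:
    """Hypothetical holder for an ExpCmdSN or ExpStatSN value."""

    def __init__(self, start: int) -> None:
        self.expected = start

    def accept(self, incoming: int) -> str:
        verdict = classify_sn(incoming, self.expected)
        if verdict == "in-order":
            self.expected += 1  # only an in-order PDU advances the counter
        return verdict
```

	A target would keep one such counter per session for ExpCmdSN, and an
	initiator one per connection for ExpStatSN.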


5.1.2	PDU trace

	The following gives a trace of the PDUs sent during a READ command,
	and the values of the counters maintained by the initiator and the
	target during the exchange of PDUs. (This is taken directly from the
	file "ty4" starting at line 4510.)

	0.	Assumptions:
		1.	The command is a READ of 32768 bytes of data (64 blocks each
			containing 512 bytes) starting at logical block number 18572 on the
			disk (and ending with logical block number 18635).
		2.	The current value of the initiator's CmdSN counter is 22264.
		3.	The target has received and processed completely all previous
			commands from the initiator.
		4.	The current value of the target's StatSN counter is 67.
		5.	The initiator has received and processed completely all previous
			responses from the target.
		6.	The key MaxRecvPDULength=12288 was sent by the initiator to the
			target during the login phase (which says that the target should
			send no more than 12288 bytes of data (24 logical blocks) in any
			one PDU).  This value of MaxRecvPDULength means that the target
			will be forced to send the total amount of data (32768 bytes) in
			3 separate DataIn PDUs having DataSN numbers 0, 1, 2:
				0.	the first 12288 bytes (a total of 12288 sent so far);
				1.	the second 12288 bytes (a total of 24576 sent so far);
				2.	the last 8192 bytes (a total of 32768 sent so far).
		7.	Header and Data Digests are not in use.
		8.	DataPDUInOrder=yes.

	1.	The initiator sends a 48-byte SCSICommand PDU to the target:
		1.	The opcode field is 0x01 (for SCSICommand).
		2.	The F bit is 1 and the R bit is 1 (for a READ).
		3.	The DataSegmentLength (DSL) field is 0 (no data attached to this
			PDU).
		4.	The ITT field is 88931 (unique for this command).
		5.	The EDTL field is 32768 (the total number of bytes to READ).
		6.	The CmdSN field is 22264 (the initiator's CmdSN counter).
		7.	The ExpStatSN field is 67 (the initiator's ExpStatSN counter).
		8.	The CDBopcode field is 0x28 (READ (10)).
		9.	The CDBlba field is 18572 (starting logical block number on disk).
		10.	The CDBlength field is 64 (total number of blocks to READ).

	2.	The target sends a 12336 byte DataIn PDU that contains 48 header bytes
		and the 12288 bytes from the 24 logical blocks 18572 through 18595
		inclusive on the disk.  (12288 bytes (24 logical blocks) is the maximum
		amount of data allowed in any PDU sent by the target to this initiator,
		as controlled by the MaxRecvPDULength=12288 key negotiated during
		login).
		1.	The opcode field is 0x25 (for DataIn).
		2.	The I bit is 1 (for a response PDU).
		3.	The F bit is 0 (not the last DataIn PDU in this sequence).
		4.	The DSL field is 12288 (this many bytes of data are attached to
			this PDU (24 blocks of 512 bytes each)).
		5.	The ITT field is 88931 (copied from the SCSICommand PDU).
		6.	The ExpCmdSN field is 22265 (1 more than the CmdSN field in the
			SCSICommand PDU, thereby acknowledging receipt of that command).
		7.	The DataSN field is 0 (first DataIn PDU sent in response to the
			SCSICommand request).
		8.	The BufferOffset field is 0 (the attached data goes into the first
			byte in the memory buffer on the initiator).

	3.	The target sends a 12336 byte DataIn PDU that contains 48 header bytes
		and the 12288 bytes from 24 logical blocks 18596 through 18619
		inclusive on the disk.
		1.	The opcode field is 0x25 (for DataIn).
		2.	The I bit is 1 (for a response PDU).
		3.	The F bit is 0 (not the last DataIn PDU in this sequence).
		4.	The DSL field is 12288 (this many bytes of data are attached to
			this PDU (24 blocks of 512 bytes each)).
		5.	The ITT field is 88931 (copied from the SCSICommand PDU).
		6.	The ExpCmdSN field is 22265 (1 more than the CmdSN field in the
			SCSICommand PDU, again acknowledging receipt of that command).
		7.	The DataSN field is 1 (second DataIn PDU sent in response to the
			SCSICommand request).
		8.	The BufferOffset field is 12288 (the attached data goes into this
			byte in the memory buffer on the initiator, immediately after the
			last byte sent in the first DataIn PDU).

	4.	The target sends an 8240 byte DataIn PDU that contains 48 header bytes
		and the 8192 bytes from 16 logical blocks 18620 through 18635 inclusive
		on the disk.
		1.	The opcode field is 0x25 (for DataIn).
		2.	The I bit is 1 (for a response PDU).
		3.	The F bit is 1 (this is the last DataIn PDU in this sequence).
		4.	The DSL field is 8192 (this many bytes of data are attached to
			this PDU (16 blocks of 512 bytes each)).
		5.	The ITT field is 88931 (copied from the SCSICommand PDU).
		6.	The ExpCmdSN field is 22265 (1 more than the CmdSN field in the
			SCSICommand PDU, again acknowledging receipt of that command).
		7.	The DataSN field is 2 (third DataIn PDU sent in response to the
			SCSICommand request).
		8.	The BufferOffset field is 24576 (the attached data goes into this
			byte in the memory buffer on the initiator, immediately after the
			last byte sent in the previous DataIn PDU).

	5.	The target sends a 48 byte SCSIResponse PDU to the initiator:
		1.	The opcode field is 0x21 (for SCSIResponse).
		2.	The I bit is 1 (for a response PDU).
		3.	The F bit is 1 (always 1 on a SCSIResponse PDU).
		4.	The Response and Status fields are both 0 (Successful completion).
		5.	The DSL field is 0 (no bytes of data are attached to this PDU).
		6.	The ITT field is 88931 (copied from the SCSICommand PDU).
		7.	The StatSN field is 67 (the target's StatSN counter).
		8.	The ExpCmdSN field is 22265 (1 more than the CmdSN field in the
			SCSICommand PDU, thereby acknowledging receipt of that command).

	At this point, all 32768 bytes (64 logical blocks) of data have been
	successfully transferred from logical block numbers 18572 through 18635
	inclusive on the target's disk into the initiator's memory buffer (which
	is at an address given to the initiator by the SCSI subsystem on the
	initiator).  Furthermore, the initiator's CmdSN counter was updated in step
	1 above to now have the value 22265 (1 more than the CmdSN value used in
	the SCSICommand PDU), and the target's StatSN counter was updated in step
	5 above to now have the value 68 (1 more than the StatSN value used in the
	SCSIResponse PDU).  Finally, the initiator's ITT counter was updated in
	step 1 above to now have the value 88932, which will be unique for the
	next command to be sent by the initiator (remember, this does not have to
	be implemented as a counter).
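
	The arithmetic behind the three DataIn PDUs in this trace can be checked
	with a short sketch.  The function name "datain_layout" and its parameters
	are hypothetical, and the sketch assumes that the PDU data limit is a
	whole multiple of the 512-byte block size and that no digests are in use
	(so each PDU is a 48-byte header plus its data).

```python
def datain_layout(start_lba, num_blocks, max_pdu_bytes, block_size=512):
    """Return one dict per DataIn PDU for a READ sent as a single sequence."""
    total = num_blocks * block_size
    pdus = []
    offset = 0
    data_sn = 0
    while offset < total:
        dsl = min(max_pdu_bytes, total - offset)   # data bytes in this PDU
        first = start_lba + offset // block_size
        pdus.append({
            "DataSN": data_sn,
            "BufferOffset": offset,
            "DSL": dsl,
            "first_lba": first,
            "last_lba": first + dsl // block_size - 1,
            "F": 1 if offset + dsl == total else 0,  # set on the last PDU only
            "pdu_bytes": 48 + dsl,                   # header plus attached data
        })
        offset += dsl
        data_sn += 1
    return pdus
```

	Calling datain_layout(18572, 64, 12288) gives three entries whose
	BufferOffset, DSL and block ranges match steps 2, 3 and 4 above.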


5.1.3	QueueCommand

	The SCSI mid-level gets requests from higher levels of software, such as
	a file system, to READ or WRITE data blocks from/to a SCSI device.  Using
	information provided by the higher level, the SCSI mid-level figures out
	where in memory the data should be transferred to (on a READ) or from (on
	a WRITE), and which SCSI device and which data blocks on the device are
	involved in the transfer.  It then constructs a SCSI Command to perform the
	transfer and passes it to a "QueueCommand" function provided by each SCSI
	device driver.  In our case, the SCSI device driver is our
	"iscsi_initiator" module, and its "QueueCommand" function is the
	"iscsi_initiator_queuecommand" entry.  This function is always executed
	in the kernel, but within the context of some process or thread.  It is
	never executed in an interrupt context.


5.1.3.1	Scsi_Cmnd

	When the SCSI mid-level wants to perform a READ operation, it creates
	a "Scsi_Cmnd" data structure and passes a pointer to it as the first
	parameter in a call to the "iscsi_initiator_queuecommand" function.
	This structure contains several important pieces of information:

	1.	the "target" field which is used by "iscsi_initiator_queuecommand"
		to associate this command with the appropriate session and connection.

	2.	the "cmnd" field which contains all the information needed by the SCSI
		system on the target to perform the SCSI operation.  (This is usually
		called the "CDB", for "Command Descriptor Block".)  It includes an
		"opcode" value of 0x28 for READ (10), a "lun" value (which duplicates
		the lun field mentioned next), the "logical block address" value
		(i.e., the number of the logical block on the disk where this read
		starts), the "length" value, which is the "request_bufflen" field
		(mentioned below) divided by 512 (i.e., the number of logical blocks
		to read, each logical block being 512 bytes), and finally a "control"
		value.

	3.	the "lun" field which contains the "Logical Unit Number" used by SCSI
		to identify devices.  This value duplicates a field in the CDB.

	4.	the "request_bufflen" field which contains the total number of bytes of
		data to be read (i.e., transferred from the target) by this command.
		This value is 512 times the value of a field in the CDB.

	5.	the "request_buffer" field, which indicates where in memory SCSI
		expects the data to be put as it is read from the target.
	
	6.	the "use_sg" field, which is 0 if the "request_buffer" field is the
		address in memory of the data buffer itself, and which is greater than
		0 if the "request_buffer" field is the address in memory of a
		scatter-gather list containing "use_sg" data buffers.

	7.	the "sc_data_direction" field, which is 1 if the command is generally
		categorized as a "write" command (i.e., if any data flows during this
		command, it will be from initiator to target), and which is 2 if the
		command is generally categorized as a "read" command (i.e., if any data
		flows during this command, it will be from target to initiator).

	8.	the "result" field, which is filled in by the initiator when it has
		finished processing this command completely.  The values that can be
		stored here are defined by SAM2.  The value 0 means "GOOD".  The
		others are various error codes which are defined as DID_XXX symbols
		in the file /usr/src/linux/drivers/scsi/scsi.h.  (DID_OK is defined
		as 0.)  The initiator will fill this field by opaquely copying the
		value in the Status field of a SCSIResponse PDU.  See section 3.4.2
		of draft 9 for a partial listing of these values.
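
	To make the relation between the CDB and "request_bufflen" concrete, the
	following sketch packs a 10-byte READ (10) CDB.  The helper name
	"build_read10_cdb" is hypothetical, and placing the LUN in the top 3 bits
	of byte 1 follows older SCSI-2 usage, which is an assumption about what
	the mid-level does here rather than something taken from its source.

```python
import struct


def build_read10_cdb(lba, num_blocks, lun=0, control=0):
    """Pack a READ (10) Command Descriptor Block (10 bytes, big-endian)."""
    # Byte 0: opcode 0x28; byte 1: LUN in the top 3 bits (SCSI-2 style);
    # bytes 2-5: logical block address; byte 6: reserved;
    # bytes 7-8: transfer length in blocks; byte 9: control.
    return struct.pack(">BBIBHB", 0x28, (lun & 0x07) << 5,
                       lba, 0, num_blocks, control)


cdb = build_read10_cdb(lba=18572, num_blocks=64)
# The byte count that goes with this CDB is 512 times its length field.
request_bufflen = 512 * struct.unpack(">H", cdb[7:9])[0]
```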


5.1.3.2	struct command

	The "iscsi_initiator_queuecommand" function creates a new "struct command"
	data structure, initializes the fields, and adds it to the
	"pending_commands" list for the current connection.  The "struct command"
	contains a number of fields that are initialized at this time.  Since we
	are considering a READ operation, only those fields relevant to a READ are
	discussed here.

	1.	the "SCpnt" field, which is initialized with a pointer to the
		"Scsi_Cmnd" structure.

	2.	the "init_task_tag" field, which is initialized with the value of the
		session-wide "init_task_tag".  This value will be put into all PDUs
		sent/received during the processing of this command, as discussed
		above.

	3.	the "recvd_length" field, which is initialized to 0 and will be used by
		the initiator to keep track of the number of bytes of data received in
		one burst or sequence of DataIn PDUs.

	4.	the "data_offset" field, which is initialized to 0 and will be used by
		the initiator to keep track of where in its memory buffer to put the
		bytes of data received from the next DataIn PDU.

	5.	the "data_in_sn" field, which is initialized to 0 and will be used by
		the initiator to check the DataSN field of all DataIn PDUs sent by the
		target, as discussed above.


5.1.3.3	SCSI Command PDU

	"iscsi_initiator_queuecommand" then sets up the header of the "SCSI
	Command" PDU in the "iscsi_cmd" field of the "struct command".  This
	header contains the following fields:
		opcode 0x01,
		I bit 0,
		F bit 1,
		R bit 1 (for read),
		W bit 0 (no write),
		DSL 0 (no data in this PDU),
		the LUN copied from the "lun" field in the "Scsi_Cmnd" (see above),
		the ITT for this task (see above),
		the EDTL copied from the "request_bufflen" field in the "Scsi_Cmnd"
			(see above),
		the CmdSN for this command (see above),
		the ExpStatSN for this PDU (see above),
		the CDB copied from the "cmnd" field in the "Scsi_Cmnd" (see above).

	"iscsi_initiator_queuecommand" then attaches this newly initialized
	"struct command" to the end of the "pending_commands" list for this
	connection, and then does an "up" operation on the "tx_sem" semaphore for
	the session in order to signal the "tx_thread" that a PDU is ready for
	transmission to the target.
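
	The bookkeeping done by "iscsi_initiator_queuecommand" can be sketched
	roughly as follows.  This is a simplified model, not the module's code:
	the "Command" and "Session" classes are hypothetical, only one connection
	is modelled (so the pending list lives in the session object), and the
	PDU header itself is omitted.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class Command:
    init_task_tag: int
    cmd_sn: int
    request_bufflen: int
    recvd_length: int = 0   # bytes received in the current burst
    data_offset: int = 0    # where the next DataIn payload goes
    data_in_sn: int = 0     # next DataSN expected from the target
    tx_size: int = 48       # non-zero: header still waiting to be sent


@dataclass
class Session:
    init_task_tag: int = 0
    cmd_sn: int = 0
    pending_commands: list = field(default_factory=list)
    tx_sem: threading.Semaphore = field(
        default_factory=lambda: threading.Semaphore(0))


def queuecommand(session, request_bufflen):
    """Create the per-command state, queue it, and wake the tx thread."""
    cmd = Command(init_task_tag=session.init_task_tag,
                  cmd_sn=session.cmd_sn,
                  request_bufflen=request_bufflen)
    session.init_task_tag += 1      # simplest way to keep ITTs unique
    session.cmd_sn += 1             # one CmdSN per command sent
    session.pending_commands.append(cmd)
    session.tx_sem.release()        # the "up" on tx_sem
    return cmd
```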


5.1.4	Tx_Thread

	Each iSCSI session contains one kernel thread that performs all
	transmissions of PDUs to the target.  This thread consists of an infinite
	loop that:
	
	1.	blocks on a "down" operation on the "tx_sem" semaphore for this session
		until some other process or thread does a corresponding "up" on this
		semaphore to indicate that a PDU is ready for transmission to the
		target.

	2.	for each connection open in this session, search its pending command
		list to see if there are any PDUs ready to be sent to the target.

	3.	for each PDU ready to be sent to the target, call "sock_sendmsg" to
		send it out over an open TCP connection to the target.

	4.	after a PDU has been sent, set its "tx_size" field to 0 to indicate
		that it has been sent.

	5.	after all PDUs for all connections in the session have been sent,
		return to step 1.

	Note that when a PDU is sent, tx_thread does not remove it from the
	"pending_commands" list for a connection because the information about the
	command must be retained until the initiator receives a response from the
	target.
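
	A rough model of this loop is given below.  It is a sketch under
	assumptions: each pending command is a plain dict with hypothetical
	"header" and "tx_size" keys, "send" stands in for the call to
	"sock_sendmsg", and the "stop" event is added only so that the loop can
	be ended cleanly.

```python
def tx_pass(connections, send):
    """Steps 2-4: send every PDU that is still waiting, on every connection.

    connections is a list holding one pending-command list per connection.
    Returns the number of PDUs sent in this pass.
    """
    sent = 0
    for pending_commands in connections:
        for cmd in pending_commands:
            if cmd["tx_size"] > 0:
                send(cmd["header"][:cmd["tx_size"]])
                cmd["tx_size"] = 0   # mark as sent, but keep it on the list
                sent += 1
    return sent


def tx_thread(tx_sem, connections, send, stop):
    """Steps 1 and 5: sleep on the semaphore, then do one pass."""
    while True:
        tx_sem.acquire()             # the "down" on tx_sem
        if stop.is_set():
            return
        tx_pass(connections, send)
```

	Note that a sent command stays on its list with "tx_size" equal to 0, so
	a second pass over the same lists sends nothing.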


5.1.5	Rx_Thread

	Each iSCSI connection contains one kernel thread that performs all
	receptions of PDUs from the target.  This thread consists of an infinite
	loop that:

	1.	Blocks on a call to sock_recvmsg waiting for the target to send a PDU
		to the initiator.  This call reads only the header of this PDU into
		an rx_buf allocated to this connection.

	2.	Uses the opcode field in the PDU header to call the appropriate
		function to deal with PDUs of this type.  If the opcode is 0x25
		(DataIn) then the function rx_data is called.  If the opcode is
		0x21 (SCSIResponse) then the function rx_rsp is called.

	3.	When the PDU has been processed by the appropriate function, return
		to step 1.
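
	One iteration of this loop might look like the sketch below.  The names
	are hypothetical, an in-memory stream stands in for the TCP socket, and
	taking the opcode from the low 6 bits of the first header byte is an
	assumption based on the later iSCSI standard; the draft this document
	follows may lay out that byte differently.

```python
import io

HEADER_LEN = 48
OPCODE_DATA_IN = 0x25
OPCODE_SCSI_RESPONSE = 0x21


def read_exact(stream, n):
    """Read exactly n bytes, as a blocking receive on a socket would."""
    buf = b""
    while len(buf) < n:
        chunk = stream.read(n - len(buf))
        if not chunk:
            raise EOFError("connection closed in mid-PDU")
        buf += chunk
    return buf


def rx_once(stream, handlers):
    """Steps 1 and 2: read one PDU header and dispatch on its opcode."""
    header = read_exact(stream, HEADER_LEN)
    opcode = header[0] & 0x3F
    handler = handlers.get(opcode)
    if handler is None:
        raise ValueError(f"unexpected opcode 0x{opcode:02x}")
    return handler(stream, header)


# Usage: two bare headers, one DataIn and one SCSIResponse.
stream = io.BytesIO(bytes([0x25]) + bytes(47) + bytes([0x21]) + bytes(47))
handlers = {
    OPCODE_DATA_IN: lambda s, h: "rx_data",
    OPCODE_SCSI_RESPONSE: lambda s, h: "rx_rsp",
}
calls = [rx_once(stream, handlers), rx_once(stream, handlers)]
```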


5.1.5.1	rx_data

	This function is called whenever the header for a DataIn PDU has been
	read in the rx_thread loop.  This function does the following:

	1.	Search the list of pending commands associated with this connection
		to find one whose init_task_tag field matches the ITT field in the
		DataIn PDU.  It is an error if no such command is found.

	2.	Extract the values from the BufferOffset and DSL fields in the DataIn
		PDU and check that the BufferOffset value in the DataIn PDU matches the
		data_offset value for this command, that the DSL field in the DataIn
		PDU does not exceed the value in the MaxRecvPDULength key sent by the
		initiator to the target, and that the total amount of data received
		by this command, including the data in this PDU, does not exceed the
		request_bufflen for this command.

	3.	Check that the DataSN value in the DataIn PDU matches the data_in_sn
		field of the command.

	4.	If all the checks are ok, increment the data_in_sn value for this
		command by 1, and the data_offset value for this command by the DSL
		value.

	5.	Finally, call sock_recvmsg to read the data attached to the DataIn
		PDU directly into the memory location indicated by the request_buffer
		field of the SCSI command (no extra copy required).
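
	The five steps above can be sketched as follows.  This is an illustration
	only: commands and PDU headers are modelled as dicts with hypothetical
	keys, "read_payload" stands in for the second call to sock_recvmsg, and
	a bytearray stands in for the buffer named by "request_buffer".

```python
class ProtocolError(Exception):
    """Raised when a DataIn PDU fails one of the checks."""


def rx_data(pending_commands, pdu, max_recv_pdu_length, read_payload):
    # Step 1: find the command this PDU belongs to.
    cmd = next((c for c in pending_commands
                if c["init_task_tag"] == pdu["ITT"]), None)
    if cmd is None:
        raise ProtocolError("no pending command with this ITT")

    # Step 2: offset and length checks.
    dsl = pdu["DSL"]
    if pdu["BufferOffset"] != cmd["data_offset"]:
        raise ProtocolError("unexpected BufferOffset")
    if dsl > max_recv_pdu_length:
        raise ProtocolError("DSL exceeds MaxRecvPDULength")
    if cmd["data_offset"] + dsl > cmd["request_bufflen"]:
        raise ProtocolError("more data than the command asked for")

    # Step 3: DataSN must be the next one expected.
    if pdu["DataSN"] != cmd["data_in_sn"]:
        raise ProtocolError("unexpected DataSN")

    # Step 4: advance the per-command counters.
    start = cmd["data_offset"]
    cmd["data_in_sn"] += 1
    cmd["data_offset"] += dsl

    # Step 5: read the payload straight into the command's buffer.
    cmd["buffer"][start:start + dsl] = read_payload(dsl)
    return cmd
```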


5.1.5.2	rx_rsp

	This function is called whenever the header for a SCSIResponse PDU has been
	read in the rx_thread loop.  This function does the following:

	1.	Search the list of pending commands associated with this connection
		to find one whose init_task_tag field matches the ITT field in the
		SCSIResponse PDU.  It is an error if no such command is found.

	2.	Check that the F-bit is set to 1 (required for a SCSIResponse PDU).

	3.	Check that the StatSN in the PDU matches the exp_stat_sn field
		for this connection and then increment the exp_stat_sn field by 1.

	4.	Check that the ExpCmdSN in the PDU matches the cmd_sn field for this
		connection.

	5.	Check that the O-bit (Overflow), U-bit (Underflow) and Residual Count
		fields agree with the request_bufflen of the original command and
		the data_offset (which gives the amount of data actually transferred).

	6.	Store the Status value from the PDU into the result field of the SCSI
		command, then call the SCSI Mid-level "done" function for this command.

	7.	Remove this command from the list of pending commands for this
		connection because it has now been completely processed.
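
	A sketch of these seven steps is given below.  The dict keys are
	hypothetical, and the residual check in step 5 is one plausible reading
	of "agree": with the U-bit set, the Residual Count must equal the
	shortfall; otherwise the whole buffer must have been filled.

```python
class ProtocolError(Exception):
    """Raised when a SCSIResponse PDU fails one of the checks."""


def rx_rsp(conn, pdu):
    # Step 1: find the command this response completes.
    cmd = next((c for c in conn["pending_commands"]
                if c["init_task_tag"] == pdu["ITT"]), None)
    if cmd is None:
        raise ProtocolError("no pending command with this ITT")

    # Step 2: the F-bit is always 1 on a SCSIResponse.
    if pdu["F"] != 1:
        raise ProtocolError("F-bit not set")

    # Step 3: StatSN must be the one expected on this connection.
    if pdu["StatSN"] != conn["exp_stat_sn"]:
        raise ProtocolError("unexpected StatSN")
    conn["exp_stat_sn"] += 1

    # Step 4: the target must have acknowledged every command sent.
    if pdu["ExpCmdSN"] != conn["cmd_sn"]:
        raise ProtocolError("unexpected ExpCmdSN")

    # Step 5: residual information must match what was received.
    shortfall = cmd["request_bufflen"] - cmd["data_offset"]
    if pdu["U"]:
        consistent = pdu["ResidualCount"] == shortfall
    else:
        consistent = shortfall == 0 and (pdu["O"] or pdu["ResidualCount"] == 0)
    if not consistent:
        raise ProtocolError("residual fields disagree with data received")

    # Step 6: hand the status back to the SCSI mid-level.
    cmd["result"] = pdu["Status"]
    cmd["done"](cmd)

    # Step 7: the command is finished.
    conn["pending_commands"].remove(cmd)
    return cmd
```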


5.1.6	Using MaxBurstSize

	The above discussion ignored the value of the key MaxBurstSize negotiated
	during login.  However, this cannot be ignored, because DataIn PDUs must
	be sent by the target in bursts or sequences having no more than
	MaxBurstSize total bytes.  Therefore, one READ command requiring more data
	than can be transferred in a single burst will be handled slightly
	differently than the "simple" READ involving only one burst that was
	discussed above.  This large READ command will require the transfer of
	several bursts or sequences, each of which will contain several DataIn PDUs
	as follows:

	1.	one "SCSI Command" PDU sent by the initiator to deliver the READ
		command information to the target,

	2.	some number of "DataIn" PDUs sent back by the target to deliver the
		data in the first burst to the initiator,

	3.	some number of "DataIn" PDUs sent back by the target to deliver the
		data in the second burst to the initiator,

	...

	4.	some number of "DataIn" PDUs sent back by the target to deliver the
		data in the last burst to the initiator,

	5.	one final "SCSI Response" PDU sent back by the target to deliver the
		final status to the initiator.

	Notice that there is no pause between bursts, and no extra PDU sent from
	either initiator or target to indicate the end of one burst or the
	beginning of another burst.  It is all based on the F-bit in the DataIn
	PDUs.  If we assume a burst contains N DataIn PDUs, then the first N-1
	DataIn PDUs of the burst will have the F-bit set to 0, and the last (Nth)
	DataIn PDU of the burst will have the F-bit set to 1.

	Furthermore, the DataSN field in the DataIn PDUs is a counter that starts
	with 0 for the first DataIn PDU in the first burst, and increments by 1
	for each successive DataIn PDU in this command, regardless of which burst
	the DataIn PDU belongs to!  In other words, the DataSN field does not get
	reset to 0 for the first DataIn PDU of each burst -- it continues
	incrementing across burst boundaries, as if these boundaries did not exist.


5.1.6.1	PDU trace

	The following gives an abbreviated trace of the PDUs sent during a READ
	command that requires the transfer of 102400 bytes (200 blocks of 512
	bytes each) from the target to the initiator.
	(This is taken directly from the file "ty4" starting at line 1190.)

	Assumptions:
		1.	The command is a READ of 102400 bytes of data (200 blocks each
			containing 512 bytes).
		2.	The key MaxBurstSize=32768 was negotiated during login.
		3.	As above, the key MaxRecvPDULength=12288 was negotiated during
			login.

	Since each burst or sequence can contain at most MaxBurstSize=32768 bytes
	(64 logical blocks), the total READ of 102400 bytes (200 logical blocks)
	will require at least 4 bursts -- the first 3 bursts will each contain
	32768 bytes (64 logical blocks), for a total of 98304 bytes (192 logical
	blocks); the fourth burst will contain the remaining 4096 bytes (8 logical
	blocks) to give the required total of 102400 bytes (200 logical blocks).

	Because a single DataIn PDU can contain at most MaxRecvPDULength=12288
	bytes (24 logical blocks), the first 3 bursts will each contain 3 DataIn
	PDUs -- the first 2 DataIn PDUs will each have the F-bit set to 0 and will
	contain 12288 bytes (24 logical blocks), for a total of 24576 bytes (48
	logical blocks); the third DataIn PDU will have the F-bit set to 1 and will
	contain the remaining 8192 bytes (16 logical blocks) in the burst, giving a
	total of 32768 bytes (64 logical blocks) for the burst.  The last burst
	will consist of a single DataIn PDU containing 4096 bytes (8 logical
	blocks) of data.

	This is a grand total of 10 DataIn PDUs, and the DataSN field in these PDUs
	will increase linearly from 0 to 9, without regard to the boundaries
	between bursts (which are marked by the F-bit set to 1).

	The complete sequence of PDUs for this operation follows (note that all
	these PDUs carry the same ITT = 88905, and all the PDUs sent by the
	target contain ExpCmdSN=22239):

	Initiator->Target
		Opcode = 0x01;  F = 1;  R = 1;  DSL = 0;  EDTL = 102400;
			CmdSN = 22238;  ExpStatSN = 41
		
	Target -> Initiator
		Opcode = 0x25;  F = 0;  DataSN = 0;  DSL = 12288;  Buffer Offset = 0
			First DataIn PDU of first burst (12288 bytes total so far)
		Opcode = 0x25;  F = 0;  DataSN = 1;  DSL = 12288;  Buffer Offset = 12288
			Second DataIn PDU of first burst (24576 bytes total so far)
		Opcode = 0x25;  F = 1;  DataSN = 2;  DSL = 8192;  Buffer Offset = 24576
			Third (last) DataIn PDU of first burst (32768 bytes total so far)

		Opcode = 0x25;  F = 0;  DataSN = 3;  DSL = 12288;  Buffer Offset = 32768
			First DataIn PDU of second burst (45056 bytes total so far)
		Opcode = 0x25;  F = 0;  DataSN = 4;  DSL = 12288;  Buffer Offset = 45056
			Second DataIn PDU of second burst (57344 bytes total so far)
		Opcode = 0x25;  F = 1;  DataSN = 5;  DSL = 8192;  Buffer Offset = 57344
			Third (last) DataIn PDU of second burst (65536 bytes total so far)

		Opcode = 0x25;  F = 0;  DataSN = 6;  DSL = 12288;  Buffer Offset = 65536
			First DataIn PDU of third burst (77824 bytes total so far)
		Opcode = 0x25;  F = 0;  DataSN = 7;  DSL = 12288;  Buffer Offset = 77824
			Second DataIn PDU of third burst (90112 bytes total so far)
		Opcode = 0x25;  F = 1;  DataSN = 8;  DSL = 8192;  Buffer Offset = 90112
			Third (last) DataIn PDU of third burst (98304 bytes total so far)

		Opcode = 0x25;  F = 1;  DataSN = 9;  DSL = 4096;  Buffer Offset = 98304
			First (last) DataIn PDU of fourth burst (102400 bytes total so far)

		Opcode = 0x21;  F = 1;  DSL = 0;  Response = 0;  Status = 0x00
			StatSN = 41
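
	The DataIn part of this trace can be reproduced with a short sketch that
	applies the two limits in turn.  The function name "plan_read" is
	hypothetical; it models only what the trace shows, namely that each burst
	is filled up to MaxBurstSize and each PDU up to MaxRecvPDULength.

```python
def plan_read(total_bytes, max_burst, max_pdu):
    """Return (DataSN, F, DSL, BufferOffset) for every DataIn PDU."""
    pdus = []
    offset = 0
    data_sn = 0                       # never reset between bursts
    while offset < total_bytes:
        burst_left = min(max_burst, total_bytes - offset)
        while burst_left > 0:
            dsl = min(max_pdu, burst_left)
            burst_left -= dsl
            f_bit = 1 if burst_left == 0 else 0   # last PDU of this burst
            pdus.append((data_sn, f_bit, dsl, offset))
            data_sn += 1
            offset += dsl
    return pdus
```

	With plan_read(102400, 32768, 12288) this yields the ten DataIn PDUs
	listed above, in the same order.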


5.1.6.2	Implementation

	The grouping of successive DataIn PDUs into bursts or sequences is
	controlled by the "recvd_length" field in the "struct command".  This is
	initialized to 0 in the initiator's "iscsi_initiator_queuecommand" function
	when the READ command is first set up.  Whenever it receives a DataIn PDU,
	the "rx_thread" will add the DSL of the DataIn PDU to the recvd_length
	field in the associated command and will check the result in the
	recvd_length field against the value of the negotiated MaxBurstSize key to
	ensure that the burst is not too long.  Once this has been done, if the
	F-bit in the DataIn PDU is set to 1, then the recvd_length field will be
	reset to 0 in preparation for the start of the next burst (if any) that may
	follow.  In all cases (F-bit set to 0 or 1), the data_in_sn field of the
	associated command is checked against the DataSN field in the DataIn PDU
	and is then incremented by 1, as was already discussed above.
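
	The burst accounting just described amounts to only a few lines.  In this
	sketch the command is a dict with a hypothetical "recvd_length" key, and
	raising an exception stands in for whatever error handling the module
	actually performs.

```python
def account_datain(cmd, dsl, f_bit, max_burst_size):
    """Add one DataIn PDU to the current burst and return the running total."""
    cmd["recvd_length"] += dsl
    if cmd["recvd_length"] > max_burst_size:
        raise ValueError("burst exceeds MaxBurstSize")
    if f_bit:
        cmd["recvd_length"] = 0   # F-bit ends the burst: start counting again
    return cmd["recvd_length"]
```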


5.1.7	Using Phase Collapse

	The preceding discussions assumed that "phase collapse" was not being used.
	"phase collapse" is the term used when the target decides to send the final
	status of an entire SCSI READ command in the last DataIn PDU it sends for
	that command and thereby no longer needs to send the final SCSIResponse
	PDU.  This can be done only if the Response field in the SCSIResponse PDU
	would have been 0, indicating "Command Completed at Target" (any other
	Response field value, such as 1 ("Target Failure"), requires the use of a
	SCSIResponse PDU).  This is also something that the target can choose to do
	or not at any time -- it is not negotiated, and the initiator must always
	be prepared to deal with it.

	The target indicates that it is using "phase collapse" by setting the
	S-bit to 1 in the final DataIn PDU it sends back to the initiator at the
	end of a READ command.  (In everything discussed so far, the S-bit in
	every DataIn PDU has been set to 0.)  When this S-bit is set to 1, the
	information normally carried in the SCSIResponse PDU must now be carried
	in the DataIn PDU (in addition to all the information already carried in
	the DataIn PDU, of course).  This means that the following fields, which
	are normally reserved in a DataIn PDU, will now contain meaningful
	information:

		StatSN	- the target fills this field with the current value of its
				  StatSN counter, and will then increment this counter (just as
				  it normally does when sending a SCSIResponse PDU).

		Status	- the target fills this field with the same status value that
				  it would normally put in the Status field of a SCSIResponse
				  PDU, with the understanding that this value must indicate
				  a successful completion.  (The usual successful completion
				  value used is 0, meaning "GOOD".)

		O-bit	- the target sets this bit if it wanted to send more data than
				  the initiator asked for in the EDTL field of the SCSICommand
				  PDU.  In this case, the ResidualCount field will contain the
				  number of bytes of extra data that were not transferred.

		U-bit	- the target sets this bit if it could not send all the data
				  that the initiator asked for in the EDTL field of the
				  SCSICommand PDU.  In this case, the ResidualCount field will
				  contain the number of bytes of missing data that were not
				  transferred.

		ResidualCount	- set only if either the O-bit or the U-bit is set.
						  0 otherwise.
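
	As an illustration of how the O-bit, U-bit and ResidualCount fit together,
	the following sketch classifies the residual information carried in a
	DataIn PDU that has the S-bit set.  The structure and function names are
	hypothetical, and the fields are assumed to have already been extracted
	from the PDU header.  Treating "both bits set" or "a non-zero
	ResidualCount with neither bit set" as invalid is an assumption of this
	sketch, not something taken from the initiator code.

```c
#include <stdint.h>

/* Hypothetical decoded form of the residual fields listed above. */
struct collapsed_fields {
	int o_bit;					/* overflow flag */
	int u_bit;					/* underflow flag */
	uint32_t residual_count;	/* meaningful only if O or U is set */
};

enum residual_kind {
	RESIDUAL_NONE,
	RESIDUAL_OVERFLOW,
	RESIDUAL_UNDERFLOW,
	RESIDUAL_INVALID
};

/* Decide what the O-bit, U-bit and ResidualCount are reporting. */
static enum residual_kind
classify_residual(const struct collapsed_fields *f)
{
	if (f->o_bit && f->u_bit)
		return RESIDUAL_INVALID;	/* assumed to be mutually exclusive */
	if (f->o_bit)
		return RESIDUAL_OVERFLOW;
	if (f->u_bit)
		return RESIDUAL_UNDERFLOW;
	/* neither bit set: ResidualCount is expected to be 0 */
	return f->residual_count == 0 ? RESIDUAL_NONE : RESIDUAL_INVALID;
}

/* Number of data bytes the initiator should have received for a READ
 * whose SCSICommand PDU carried the given EDTL. */
static uint32_t
bytes_expected(uint32_t edtl, const struct collapsed_fields *f)
{
	if (classify_residual(f) == RESIDUAL_UNDERFLOW &&
		f->residual_count <= edtl)
		return edtl - f->residual_count;
	/* on overflow the target still sends no more than EDTL bytes */
	return edtl;
}
```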


5.1.7.1	PDU trace

	Using the same example discussed above to illustrate the use of
	MaxBurstSize during a READ operation, the only effect of a "phase collapse"
	is to eliminate the last SCSIResponse PDU and add extra information to the
	last DataIn PDU in the last burst.  The complete sequence of PDUs for this
	operation follows (note that all these PDUs carry the same ITT = 88905, and
	all the PDUs sent by the target contain ExpCmdSN=22239):

	Initiator->Target
		Opcode = 0x01;  F = 1;  R = 1;  DSL = 0;  EDTL = 102400;
			CmdSN = 22238;  ExpStatSN = 41
		
	Target -> Initiator
		Opcode = 0x25; F=0; S=0; DataSN = 0; DSL = 12288; Buffer Offset = 0
			First DataIn PDU of first burst (12288 bytes total so far)
		Opcode = 0x25; F=0; S=0; DataSN = 1; DSL = 12288; Buffer Offset = 12288
			Second DataIn PDU of first burst (24576 bytes total so far)
		Opcode = 0x25; F=1; S=0; DataSN = 2; DSL = 8192; Buffer Offset = 24576
			Third (last) DataIn PDU of first burst (32768 bytes total so far)

		Opcode = 0x25; F=0; S=0; DataSN = 3; DSL = 12288; Buffer Offset = 32768
			First DataIn PDU of second burst (45056 bytes total so far)
		Opcode = 0x25; F=0; S=0; DataSN = 4; DSL = 12288; Buffer Offset = 45056
			Second DataIn PDU of second burst (57344 bytes total so far)
		Opcode = 0x25; F=1; S=0; DataSN = 5; DSL = 8192; Buffer Offset = 57344
			Third (last) DataIn PDU of second burst (65536 bytes total so far)

		Opcode = 0x25; F=0; S=0; DataSN = 6; DSL = 12288; Buffer Offset = 65536
			First DataIn PDU of third burst (77824 bytes total so far)
		Opcode = 0x25; F=0; S=0; DataSN = 7; DSL = 12288; Buffer Offset = 77824
			Second DataIn PDU of third burst (90112 bytes total so far)
		Opcode = 0x25; F=1; S=0; DataSN = 8; DSL = 8192; Buffer Offset = 90112
			Third (last) DataIn PDU of third burst (98304 bytes total so far)

		Opcode = 0x25; F=1; S=1; DataSN = 9; DSL = 4096; Buffer Offset = 98304
			Status = 0x00;  StatSN = 41
			First (last) DataIn PDU of fourth burst (102400 bytes total so far)
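
	The DSL, Buffer Offset, DataSN, F-bit and S-bit values in the trace above
	follow mechanically from three numbers.  The sketch below reproduces them
	for EDTL = 102400, assuming MaxBurstSize = 32768 and MaxRecvPDULength =
	12288 (these two values are inferred from the trace itself).  The names
	"data_in_desc" and "plan_data_in" are hypothetical and do not appear in
	the initiator or target code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one DataIn PDU in a READ response. */
struct data_in_desc {
	uint32_t data_sn;
	uint32_t dsl;
	uint32_t buffer_offset;
	int f_bit;
	int s_bit;
};

/* Fill out[] with the DataIn PDUs needed to return edtl bytes.
 * Returns the number of PDUs, or 0 if out[] is too small or a limit is 0. */
static size_t
plan_data_in(uint32_t edtl, uint32_t max_burst, uint32_t max_pdu,
			 int phase_collapse, struct data_in_desc *out, size_t max_out)
{
	size_t n = 0;
	uint32_t offset = 0;
	uint32_t burst_left = 0;

	if (max_burst == 0 || max_pdu == 0)
		return 0;

	while (offset < edtl) {
		uint32_t dsl = max_pdu;

		if (n == max_out)
			return 0;
		if (burst_left == 0)
			burst_left = max_burst;		/* a new burst starts here */

		/* a PDU is limited by the PDU size, the burst, and the EDTL */
		if (dsl > burst_left)
			dsl = burst_left;
		if (dsl > edtl - offset)
			dsl = edtl - offset;

		out[n].data_sn = (uint32_t)n;	/* never reset between bursts */
		out[n].dsl = dsl;
		out[n].buffer_offset = offset;

		burst_left -= dsl;
		offset += dsl;

		/* F = 1 on the last PDU of each burst and of the command */
		out[n].f_bit = (burst_left == 0 || offset == edtl);
		/* S = 1 only on the very last PDU, and only with phase collapse */
		out[n].s_bit = (phase_collapse && offset == edtl);
		n++;
	}
	return n;
}
```

	Calling plan_data_in(102400, 32768, 12288, 1, out, 16) yields 10 entries
	that match the trace; passing 0 for phase_collapse gives the same entries
	with S = 0 throughout, as in the earlier trace that ends with a separate
	SCSIResponse PDU.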


5.1.7.2	Implementation

	On the initiator side, "phase collapse" is handled entirely within the
	"rx_data" function of the rx_thread.  After doing all the processing
	already described above for the rx_data function, the S-bit is checked,
	and if it is set to 1, then the function "do_scsi_response" is called.

	"do_scsi_response" performs the common processing for the SCSIResponse PDU
	and the DataIn PDU with phase collapse.  This has already been discussed
	above under the discussion of SCSIResponse PDU processing done by the
	rx_rsp function (steps 2 through 7 inclusive of that function).


6.0	FFP Testing

	In order to test Initiators and Targets on READ and WRITE commands in Full
	Feature Phase, the test scripts will have to be fairly long, since they
	have to complete a full login phase and get past the initial discovery
	steps in the Full Feature Phase (i.e., the "iscsi_config up" step).
	Furthermore, since the initiator controls all transactions, the vendor
	will have to run programs on the initiator side that exhibit certain
	behavior patterns that the target test script will look for.  A good
	situation would be to have the target script simulate a target disk, and
	have the vendor's initiator mount that disk and then read/write large
	files from/to that disk.  In doing this you have to be sure that the data
	in the files is actually forced to disk.  For example, Linux likes to keep
	data in its buffer cache, and only periodically flushes the cache to the
	actual disk.  Similarly, if data to be read is already in its buffer
	cache, Linux will avoid reading it from the disk again.  This makes Linux
	more efficient, but defeats the tests we want to run on the target!  The
	Linux "sync" command should probably be used after every copy command to
	ensure that the data is forced from the buffer cache to disk.
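
	If the vendor's test program is written in C rather than driven from the
	shell, one possible way to get the same effect for a single file is to
	call fsync() on it before closing it.  The following is a minimal sketch
	under that assumption; the function name "write_and_flush" is
	hypothetical.  Note that this covers only the write side: it does nothing
	about reads that are satisfied from the buffer cache.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write len bytes from buf to path and force them out of the buffer
 * cache.  Returns 0 on success and -1 on any error. */
static int
write_and_flush(const char *path, const void *buf, size_t len)
{
	const char *p = buf;
	int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

	if (fd < 0)
		return -1;

	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n < 0) {
			if (errno == EINTR)
				continue;		/* interrupted, try again */
			close(fd);
			return -1;
		}
		p += n;
		len -= (size_t)n;
	}

	/* ask the kernel to push this file's data to the device now */
	if (fsync(fd) < 0) {
		close(fd);
		return -1;
	}
	return close(fd);
}
```

	For a file under the mount point used earlier, path would be something
	like "/mnt/scsi/rdr/bigfile" (the file name here is only an example).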


6.1	Testing READ when DUT is an initiator

	In this situation, the test script will simulate a target.  This target
	will wait to receive a READ command from the initiator, and then will
	send responses to the initiator in order to test it.

6.1.1	Testing for correctness

	Clearly an important type of test will be a target script that checks all
	commands sent to it by the initiator (the formats are correct, the counters
	are all incrementing correctly, the various negotiated keys are being
	applied correctly, etc.) and then responds correctly to these commands.
	Thus this one script will verify that under "normal" circumstances, an
	initiator under test is correctly performing a whole set of testable items.

	Examples of this type of test follow.

6.1.1.1	Testing MaxRecvPDULength

	Testable item:  The amount of data attached to any PDU sent by the target
					must not exceed the MaxRecvPDULength key sent to the target
					by the initiator during login phase.

	Procedure:		Have the initiator agree to send a fairly small value for
					the MaxRecvPDULength key during the login phase (the target
					cannot negotiate this, so you will have to get the vendor
					to configure this value in his initiator).  Then when the
					target receives a READ SCSICommand with the EDTL greater
					than MaxRecvPDULength, reply with a sequence of DataIn PDUs
					each containing no more than MaxRecvPDULength bytes of data
					(and together carrying no more than the EDTL of the
					SCSICommand).

	Result:			Everything about this transmission should be correct.
					Therefore, the initiator should not detect any errors.

6.1.1.2	Testing MaxBurstSize and DataSN

	Testable Item:	The amount of data sent in a sequence of DataIn PDUs must
					not exceed the value of the MaxBurstSize key negotiated
					during login phase.  Furthermore, the DataSN field in a
					sequence of DataIn PDUs sent in response to a READ
					SCSICommand should start at 0 for the first DataIn PDU
					sent, and should increment by 1 for all following DataIn
					PDUs for this SCSICommand, regardless of the number of
					bursts needed to satisfy this SCSICommand.

	Procedure:		During login phase, negotiate a value for MaxBurstSize that
					is fairly small.  Then when the target receives a READ
					SCSICommand with the EDTL greater than MaxBurstSize,
					compute the number of DataIn PDUs needed to satisfy one
					burst, and the total number of bursts that are needed.
					Suppose there are N PDUs per burst, and B bursts.  Then
					send the first burst: N-1 PDUs, each containing
					MaxRecvPDULength bytes and having the F-bit set to 0,
					followed by the Nth PDU with the F-bit set to 1.  The
					DataSN field of the first PDU should be set to 0, and
					should be incremented by 1 in all subsequent DataIn PDUs.
					Then send the second burst, with the DataSN field in the
					first DataIn PDU of that burst simply continuing to
					increment from the value in the previous DataIn PDU,
					regardless of the burst structure.  Continue in this way
					for all subsequent bursts that are part of this
					SCSICommand.

	Result:			There are no errors in this transmission.  Therefore, the
					initiator should not detect any.
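
	The values of N and B used in the procedure above can be computed as in
	the following sketch.  The function names are hypothetical, and callers
	are assumed to pass non-zero values for MaxBurstSize and
	MaxRecvPDULength.

```c
#include <stdint.h>

/* Integer division rounded up; b must be non-zero. */
static uint32_t
div_round_up(uint32_t a, uint32_t b)
{
	return a / b + (a % b != 0);
}

/* N: number of DataIn PDUs in one full burst. */
static uint32_t
pdus_per_full_burst(uint32_t max_burst, uint32_t max_pdu)
{
	return div_round_up(max_burst, max_pdu);
}

/* B: number of bursts needed to return edtl bytes. */
static uint32_t
bursts_needed(uint32_t edtl, uint32_t max_burst)
{
	return div_round_up(edtl, max_burst);
}

/* Total number of DataIn PDUs for the command; the DataSN in the last
 * one is this value minus 1, since DataSN starts at 0. */
static uint32_t
total_data_in_pdus(uint32_t edtl, uint32_t max_burst, uint32_t max_pdu)
{
	uint32_t full_bursts = edtl / max_burst;
	uint32_t tail_bytes = edtl % max_burst;

	return full_bursts * pdus_per_full_burst(max_burst, max_pdu) +
		   div_round_up(tail_bytes, max_pdu);
}
```

	With EDTL = 102400, MaxBurstSize = 32768 and MaxRecvPDULength = 12288,
	this gives N = 3, B = 4 and 10 DataIn PDUs in total, so the last DataSN
	is 9.  The final burst is shorter than the others and contains only one
	PDU.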


6.1.2	Testing for error detection

	Another important type of test will be a target script that is similar to
	the previous one, but which sends back to the initiator responses that
	contain errors.  We then observe the behavior of the initiator to see if it
	detects the error and takes the appropriate action.  Again, one script
	should simultaneously test a whole set of testable items.

	Examples of this second type of test follow.

6.1.2.1	Testing MaxRecvPDULength

	Testable item:  The amount of data attached to any PDU sent by the target
					must not exceed the MaxRecvPDULength key sent to the target
					by the initiator during login phase.

	Procedure:		Have the initiator agree to send a fairly small value for
					the MaxRecvPDULength key during the login phase (the target
					cannot negotiate this, so you will have to get the vendor
					to configure this value in his initiator).  Then when the
					target receives a READ SCSICommand with the EDTL greater
					than MaxRecvPDULength, reply with a DataIn PDU containing
					more than MaxRecvPDULength bytes of data (but not more than
					the EDTL of the SCSICommand).

	Result:			Except for the fact that the DataIn PDU contains too much
					data according to the MaxRecvPDULength key, everything else
					about this transmission should be correct.  Therefore, the
					initiator should detect the oversized DataIn PDU, but no
					other error, and should take appropriate action.  This is a
					Format Error, as defined in section 8.3 of draft 9.

6.1.2.2	Testing MaxBurstSize

	Testable Item:	The amount of data sent in a sequence of DataIn PDUs must
					not exceed the value of the MaxBurstSize key negotiated
					during login phase.

	Procedure:		During login phase, negotiate a value for MaxBurstSize that
					is fairly small.  Then when the target receives a READ
					SCSICommand with the EDTL greater than MaxBurstSize, reply
					with a sequence of DataIn PDUs, each having the F-bit set
					to 0 and each containing no more than MaxRecvPDULength
					bytes of data, yet with the total number of bytes being
					equal to the EDTL of the SCSICommand (and therefore greater
					than the MaxBurstSize).

	Result:			Except for the fact that the sequence of DataIn PDUs
					contains more than the MaxBurstSize amount of data, there
					are no other errors in this transmission.  Therefore, the
					initiator should detect that the sequence or burst is too
					long and take appropriate action.  This is a protocol
					error, as defined in section 8.8 of draft 9.

6.1.2.3	Testing DataSN

	Testable Item:	The DataSN field in a sequence of DataIn PDUs sent in
					response to a READ SCSICommand should start at 0 for the
					first DataIn PDU sent, and should increment by 1 for all
					following DataIn PDUs for this SCSICommand, regardless of
					the number of bursts needed to satisfy this SCSICommand.

	Procedure:		During login phase, negotiate a value for MaxBurstSize that
					is fairly small.  Then when the target receives a READ
					SCSICommand with the EDTL greater than MaxBurstSize,
					compute the number of DataIn PDUs needed to satisfy one
					burst, and the total number of bursts that are needed.
					Suppose there are N PDUs per burst, and B bursts.  Then
					send the first burst: N-1 PDUs, each containing
					MaxRecvPDULength bytes and having the F-bit set to 0,
					followed by the Nth PDU with the F-bit set to 1.  The
					DataSN field of the first PDU should be set to 0, and
					should be incremented by 1 in all subsequent DataIn PDUs.
					Then send the second burst, but reset the DataSN field in
					the first DataIn PDU of that burst to 0.  Continue in this
					way for all subsequent bursts that are part of this
					SCSICommand.

	Result:			Except for the fact that the sequence number of the DataIn
					PDUs is being reset at the start of every burst, there
					are no other errors in this transmission.  Therefore, the
					initiator should detect that the DataSN field is not
					correct and take appropriate action.  This is a sequence
					error, as defined in section 8.5 of draft 9.
