Ryan S. Barnett
June 28, 2001
University of Southern California

PGM HOWTO

A simplified version of the Pragmatic General Multicast (PGM) Reliable
Transport Protocol has been provided for ns 2.1b8.  This implementation
conforms to a subset of the specification provided by IETF Draft v6 by
Speakman et al.  The implementation and other PGM related files are contained
in the ~ns/pgm directory.  The default OTcl parameters for all PGM objects
are specified in ~ns/tcl/lib/ns-default.tcl.

1. What is supported?

All general PGM procedures are supported, including at least the following:

Senders:
   a. Multiple PGM senders on a single network,
   b. RDATA generation,
   c. NAK reliability,
   d. Source Path State generation.

Network Elements:
   a. Source Path State processing,
   b. NAK reliability,
   c. Constrained NAK forwarding,
   d. NAK elimination,
   e. Constrained RDATA forwarding,
   f. NAK Anticipation.

Receivers:
   a. NAK suppression (with a random back-off interval),
   b. NAK reliability.

2. What is NOT supported?

This implementation of PGM does not support the following features described
in the PGM specification.  These are left for future enhancements:

   a. PGM Options: These include fragmentation, late joining, redirection,
      Forward Error Correction, reachability, and session control.

   b. Designated Local Repairer (DLR) support.

   c. Congestion control techniques.

   d. Transmit Windows and Receive Windows. The sender is assumed to have an
      infinitely large buffer to provide a repair retransmission for any
      sequence number.

3. PGM Agents Overview

Three PGM agents are available to be used within the Tcl simulation 
environment: Agent/PGM, Agent/PGM/Sender, and Agent/PGM/Receiver.

Agent/PGM provides the "network element" functionality.  This allows the
node to intercept PGM packets in transit that are not addressed to that node
and process them accordingly.  (The required behavior is similar to the
IP Router Alert option that must be used by PGM routers.)  The typical
behavior of Agent/PGM includes: source path state processing, NAK
confirmation, constrained reliable NAK forwarding, and constrained RDATA
forwarding.

Agent/PGM/Sender provides the functionality for a node to be the source of
packets for a PGM session.  An Application, such as Application/Traffic/CBR,
is run on top of this agent.  The typical behavior of Agent/PGM/Sender
includes: heartbeat SPM generation, NAK confirmation, and delayed repair
transmission.

Agent/PGM/Receiver provides the functionality for a node to be a receiver
of a PGM session.  The typical behavior of Agent/PGM/Receiver includes:
constrained NAK generation and NAK retransmission.

NOTE: A single node can have both the Agent/PGM and Agent/PGM/Receiver
(or Agent/PGM/Sender) attached to the node.  In fact,
when PGM is activated in the Simulator, an Agent/PGM is automatically created
and attached to _every_ node created thereafter.  This is required to
allow the node to intercept PGM packets (aka router-alert) if it is to be
a "network element".  Hence, the user never explicitly creates Agent/PGM
agents.  A command can be issued to the node to disable its
Agent/PGM, if desired, to simulate an environment with non-PGM routers.  It
is perfectly normal for a node to have both an Agent/PGM and a Receiver,
with the Agent/PGM enabled, regardless of whether it is a leaf node.

4. PGM Agent Settings

The following are tunable parameters that are available for the three PGM
agent types.  The default settings are located in ~ns/tcl/lib/ns-default.tcl.

   a. Agent/PGM

      pgm_enabled_: [0 or 1, default: 1] This is used to toggle whether the
         given agent is active.  When set to 0 the node simply forwards
         packets to the next node or agent without any PGM processing.

      nak_retrans_ival_: [default: 50ms] The interval between successive
         retransmissions of a NAK while the agent is waiting for the
         corresponding NCF packet.

      nak_rpt_ival_: [default: 1000ms] This is the amount of time the network
         element will continue to repeat NAKs while waiting for a
         corresponding NCF.  Once this time expires and no NCF is received,
         then the entire repair state is removed for that sequence number.

      nak_rdata_ival_: [default: 10000ms] This is the length of time the
         network element will wait for the corresponding RDATA before removing
         the entire repair state.

      nak_elim_ival_: [default: 5000ms] Once a NAK has been confirmed, the
         network elements must discard all further NAKs for up to this length
         of time.  This should be a fraction of nak_rdata_ival_.

   b. Agent/PGM/Sender

      spm_interval_: [default: 500ms] The length of time to wait between
         sending SPM packets.

      rdata_delay_: [default: 70ms] The length of time to delay sending out
         an RDATA in response to a NAK packet.  This is to allow slow NAKs to
         get processed so we don't send out duplicate RDATA.  This delay
         should not exceed twice the greatest propagation delay in the loss
         neighborhood.

   c. Agent/PGM/Receiver

      max_nak_ncf_retries_: [default: 5] Maximum number of times the receiver
         can send out a NAK and time-out waiting for an NCF reply.  Once the
         receiver hits this many retries, it discards the NAK state entirely
         and suffers permanent data loss.

      max_nak_data_retries_: [default: 5] Maximum number of times we can
         time-out waiting for RDATA after an NCF confirmation for a NAK
         request.  Once the receiver hits this many retries, it discards the
         NAK state entirely and suffers permanent data loss.

      nak_bo_ivl_: [default: 30ms] The random back-off interval.  The
         receiver will select a random amount of time no greater than this
         value, before it will send out a NAK packet when detecting a gap in
         the data stream.  It is during this time that the receiver is
         looking for an NCF from another node that might have detected the
         gap first.

      nak_rpt_ivl_: [default: 50ms] The amount of time to wait for an NCF
         packet after sending out a NAK packet to the upstream node.  If no
         NCF is received, another random back-off time is observed, and then
         the NAK is retransmitted.

      nak_rdata_ivl_: [default: 1000ms] The amount of time to wait for RDATA
         after receiving an NCF confirmation for a given NAK.  Once this timer
         expires, another random back-off time is observed, and then the NAK
         is retransmitted.
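
As a sketch of how these settings might be changed from a simulation script:
the class-wide form below is the same mechanism that ns-default.tcl uses, and
the per-object form uses "set" on an existing agent.  All values shown, and
the 20ms worst-case propagation delay, are illustrative assumptions rather
than recommendations.  Because an Agent/PGM is created along with each node,
class-wide Agent/PGM overrides presumably need to appear before the nodes
are created.

```tcl
# Class-wide overrides: these affect agents created after this point.
# The values are illustrative assumptions only.
Agent/PGM set nak_rdata_ival_ 4000ms
Agent/PGM set nak_elim_ival_  2000ms   ;# kept a fraction of nak_rdata_ival_

# rdata_delay_ should not exceed twice the greatest propagation delay in
# the loss neighborhood; a 20ms worst-case delay is assumed here.
Agent/PGM/Sender set rdata_delay_ 40ms

# Per-object overrides on a single receiver.
set rcv [new Agent/PGM/Receiver]
$rcv set max_nak_ncf_retries_ 10
$rcv set nak_bo_ivl_ 50ms
```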

5. Using PGM

PGM requires that multicast be enabled.  Your Tcl simulation script
therefore needs to initialize the ns object with the following statement:

   set ns [new Simulator -multicast on]

To allow nodes to intercept PGM packets and process them, the following
statement must be used:

   $ns node-config -PGM ON

This will instruct ns to implicitly create an Agent/PGM agent for every
new node that is created.  If you want to deactivate an Agent/PGM for a node,
(for example to simulate a non-PGM router), you must extract the agent
from the node using "get-pgm", and then set pgm_enabled_ to 0.  Here is an
example:

   set node1 [$ns node]
   set pgm_agent1 [$node1 get-pgm]
   $pgm_agent1 set pgm_enabled_ 0

Note that it is perfectly fine to keep a node's Agent/PGM enabled even if
the node also has a Receiver or Sender attached to it.

Create the multicast group for your PGM session, for example:

   set group [Node allocaddr]

Create the Sending PGM agent, for example:

   set src [new Agent/PGM/Sender]
   $ns attach-agent $node2 $src
   $src set dst_addr_ $group
   $src set dst_port_ 0

Attach the Constant Bit Rate traffic source to the PGM Sender, for example:

   set cbr [new Application/Traffic/CBR]
   $cbr attach-agent $src
   $cbr set rate_ 448Kb
   $cbr set packetSize_ 210
   $cbr set random_ 0

Create the PGM Receiver agents and attach them to the nodes that should act
as receivers, for example:

   set rcv3 [new Agent/PGM/Receiver]
   $ns attach-agent $node3 $rcv3
   $ns at 0.01 "$node3 join-group $rcv3 $group"

You must of course attach the links to the nodes using duplex-link, and
set the routing protocol.  You may also want to add loss modules to
have packets be dropped from various links.  See the examples in the
~ns/pgm/tcl directory for further information.
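
A minimal sketch of the link and routing setup follows.  It assumes that $ns
and the nodes $node1, $node2, and $node3 have already been created as in the
steps above.  The chain topology, the link bandwidth and delay, and the
choice of dense-mode (DM) multicast routing are assumptions made for
illustration.

```tcl
# Assumed three-node chain: node2 (sender) -- node1 -- node3 (receiver).
# Bandwidth, delay, and queue type are illustrative.
$ns duplex-link $node2 $node1 1.5Mb 10ms DropTail
$ns duplex-link $node1 $node3 1.5Mb 10ms DropTail

# Dense-mode multicast routing (an assumed choice).
$ns mrtproto DM
```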

In order to start the simulation you should first have the PGM Sender
begin to send out heartbeat SPM packets.  These will initialize the
PGM nodes with source path state.  This allows the nodes to know where
to send NAK packets in the event of a packet drop.  The SPMs should be
propagated throughout the network before ODATA is sent from the PGM Sender.
For example:

   $ns at 0.3 "$src start-SPM"

Now you can activate the CBR traffic source, for example:

   $ns at 0.4 "$cbr start"

To finish the simulation you should first terminate the CBR, and then 
terminate the heartbeat SPM packets from the PGM Sender.  For example:

   $ns at 1.5 "$cbr stop"
   $ns at 2.0 "$src stop-SPM"

And then call the finish procedure:

   $ns at 2.0 "finish"

You can then gather PGM statistics on the results of the simulation through
the finish procedure.  To do this you issue the command "print-stats" on the
desired Agent/PGM, Agent/PGM/Sender, or Agent/PGM/Receiver.  Remember
that Receiver and Sender nodes may also have an Agent/PGM, so you will need
to execute two print-stats for a single node.  For example,

   proc finish {} {
      ...

      $src print-stats

      set pgm_agent2 [$node2 get-pgm]
      $pgm_agent2 print-stats

      set pgm_agent3 [$node3 get-pgm]
      $pgm_agent3 print-stats

      $rcv3 print-stats
 
      ...
   }
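
Putting the steps of this section together, a complete script might look like
the following sketch.  The three-node chain topology, the link parameters,
the DM routing choice, and the placement of the loss module are assumptions
made for illustration.  The agent commands and timings are the ones used in
the examples above.

```tcl
# Assumed topology: node2 (sender) -- node1 (PGM router) -- node3 (receiver)
set ns [new Simulator -multicast on]
$ns node-config -PGM ON

set node1 [$ns node]
set node2 [$ns node]
set node3 [$ns node]

# Link parameters are illustrative.
$ns duplex-link $node2 $node1 1.5Mb 10ms DropTail
$ns duplex-link $node1 $node3 1.5Mb 10ms DropTail

# Dense-mode multicast routing (an assumed choice).
$ns mrtproto DM
set group [Node allocaddr]

# PGM sender on node2, driven by a CBR source.
set src [new Agent/PGM/Sender]
$ns attach-agent $node2 $src
$src set dst_addr_ $group
$src set dst_port_ 0

set cbr [new Application/Traffic/CBR]
$cbr attach-agent $src
$cbr set rate_ 448Kb
$cbr set packetSize_ 210
$cbr set random_ 0

# PGM receiver on node3.
set rcv3 [new Agent/PGM/Receiver]
$ns attach-agent $node3 $rcv3
$ns at 0.01 "$node3 join-group $rcv3 $group"

# Drop ODATA periodically on the router-to-receiver link so that there is
# something to repair.
set loss_module [new PGMErrorModel]
$loss_module drop-packet ODATA 10 5
$loss_module drop-target [$ns set nullAgent_]
$ns lossmodel $loss_module $node1 $node3

proc finish {} {
    global src rcv3 node1 node2 node3
    $src print-stats
    # Every node also carries an implicitly created Agent/PGM.
    foreach n [list $node1 $node2 $node3] {
        [$n get-pgm] print-stats
    }
    $rcv3 print-stats
    exit 0
}

# SPMs first, then data; stop in the reverse order.
$ns at 0.3 "$src start-SPM"
$ns at 0.4 "$cbr start"
$ns at 1.5 "$cbr stop"
$ns at 2.0 "$src stop-SPM"
$ns at 2.0 "finish"
$ns run
```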

The statistics that are printed out for Agent/PGM/Sender look like the
following:

   pgmSender-0
           Last ODATA seqno: 266
           Last SPM seqno: 3
           Number of NAKs received: 27
           Number of RDATA transmitted: 27
           Max retransmission count for a single RDATA: 0

The first line is a unique identifier of the Sender.  Every time that a
"new Agent/PGM/Sender" statement is executed the counter increments by one.
The first sender will have a unique identifier of "pgmSender-0".

The next line indicates the sequence number of the last ODATA packet
transmitted from this source.  It is followed by the sequence number of the
last SPM packet transmitted, the number of NAK packets that this source
received, and the number of RDATA packets transmitted.

The last line indicates the maximum number of retransmissions of RDATA for
any particular sequence number.

The statistics that are printed out for Agent/PGM look like the following:

   pgmAgent-0:
           NAKs Transmitted:       27
           NAKs Suppressed:        0
           Unsolicited NCFs:       0
           Unsolicited RDATA:      0

The first line is the unique identifier of this agent, which is determined by
the order of node creation.  The second line indicates the number of NAKs
that were transmitted upstream.  The third line is the number of NAKs that
were not acted upon because NAK state already existed for that sequence
number.  The fourth line is the number of NCF packets that were received for
a sequence number for which no NAK had been transmitted; this quantifies the
NAK anticipation functionality.  The last line indicates the number of extra
RDATA that the agent received.  This occurs if an upstream router does not
support PGM.

The statistics that are printed out for Agent/PGM/Receiver look like the
following:

   pgmRecv-0:
           Last packet:            266
           Max packet:             266
           Packets recovered:      27
           Latency (min, max, avg):        0.134128, 0.156994, 0.144058
           Total NAKs sent:        27
           Retransmitted NAKs:     0

The first line is the unique identifier of this agent.  The next line is the
last sequence number received, followed by the maximum contiguous packet
received.  The next line is the number of packets recovered by RDATA,
followed by the minimum, maximum, and average latency to recover those
packets.  The next line indicates the total number of NAKs that were sent,
followed by the number of NAKs that were retransmitted due to timeouts.

6. Using the PGM Error Model

The PGMErrorModel allows the user to specify which packets should be lost
on a given link during the simulation.  The interface to this model is
similar to that of the Periodic Error Model.  You use the procedure
"drop-packet" with the first argument being the type of PGM packet you
would like dropped, followed by the cycle period, and finally the offset
within each period.

To specify the type of PGM packet, use one of the following
strings: SPM, ODATA, RDATA, NAK, or NCF.

Here is an example of how to drop the fifth ODATA packet that crosses the
link from Node 1 to Node 5, and then the fifth packet in each subsequent
cycle of 10 ODATA packets that cross the link:

   set loss_module [new PGMErrorModel]
   $loss_module drop-packet ODATA 10 5
   $loss_module drop-target [$ns set nullAgent_]
   $ns lossmodel $loss_module $node1 $node5

Note that the packet drop cycle is counted by the number of packets of the
given PGM packet type that cross the link.  It is not dependent on the
sequence number contained within the packet.
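
The counting rule can be illustrated with a few lines of plain Tcl.  The proc
below is a hypothetical model of the behavior described above, not the
PGMErrorModel code itself.  It assumes that packets of the selected type are
counted from 1 as they cross the link, so that an offset of 5 selects the
fifth packet of each cycle.

```tcl
# Hypothetical helper (not part of ns): returns 1 if the n-th packet of the
# selected type to cross the link falls on the drop slot of its cycle.
proc pgm_drop_hit {n period offset} {
    return [expr {$n % $period == $offset % $period}]
}

# Which of the first 30 ODATA packets would "drop-packet ODATA 10 5" hit?
set dropped {}
for {set n 1} {$n <= 30} {incr n} {
    if {[pgm_drop_hit $n 10 5]} {
        lappend dropped $n
    }
}
puts $dropped   ;# prints: 5 15 25
```

Here n is a position in the stream of matching packets on the link, not a
PGM sequence number.  An RDATA or SPM packet crossing the same link does not
advance the count for an ODATA drop rule.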

The PGMErrorModel only allows one type of PGM packet to be dropped.
Other error models are located in ~ns/errmodel.cc if you need a more
sophisticated error modeling capability.