
Unit 13: Session Layer and Presentation Layer




considers that the data stream may be transformed into a more compact representation. This compact data stream is reconstructed into the original data at the destination machine.
Finite Set of Symbols: Consider a library with many branch offices in which the previous day's transactions are transmitted to every other branch after closing. Transactions consist of checked-out and returned books. The information could be exchanged in either of the following ways:
1.   The name of the book, its author, the copy number, etc., together with the type of transaction, are sent.
2.   The library maintains a branch-wise table assigning a unique ID number to every book in every branch. Transactions then refer to the book's ID number instead of its title. Since book IDs are small and occupy only a few bytes, less data is transmitted (see the sketch after this list).
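
The gain from the ID-based scheme can be seen in a minimal Python sketch, assuming every branch shares the same catalogue table; the titles, IDs, and field sizes below are invented purely for illustration.

```python
# A minimal sketch of the ID-based scheme, assuming every branch shares
# the same catalogue table (titles and IDs here are made up).
catalogue = {
    1: "Computer Networks, A. S. Tanenbaum, copy 3",
    2: "Operating System Concepts, A. Silberschatz, copy 1",
    3: "Introduction to Algorithms, T. H. Cormen, copy 2",
}

def encode_transaction(book_id: int, returned: bool) -> bytes:
    """Send only a 2-byte book ID plus a 1-byte transaction type."""
    return book_id.to_bytes(2, "big") + bytes([1 if returned else 0])

def decode_transaction(msg: bytes) -> str:
    """Look the ID back up in the receiver's local copy of the table."""
    book_id = int.from_bytes(msg[:2], "big")
    action = "returned" if msg[2] else "checked out"
    return f"{catalogue[book_id]} was {action}"

wire = encode_transaction(2, returned=True)   # only 3 bytes on the wire
print(len(wire), decode_transaction(wire))    # vs. dozens of bytes for the full title
```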
Note: This technique is used throughout programming: pointers and array subscripts are frequently exchanged to avoid the cost of transferring large amounts of data between subroutines. The scheme above assumes that all objects occur with equal frequency and that the set of objects, books in this case, is finite. When text is examined, however, it is immediately noticed that some words appear more often than others. Taking a cue from this, the number of bits required to represent a document can be reduced by using a coding scheme that employs short code words for common words and longer code words for words that appear infrequently.
Huffman Encoding: Huffman encoding assigns codes to symbols according to the frequency of their use. The code tree is built as explained below (a short code sketch follows the steps):

1.   A set of nodes is created, one node per symbol, with each node's value given by the probability of the symbol's occurrence in the data.
2.   The two nodes having the smallest values are found. They are removed from the set, and a new node having the two removed nodes as children is created. The new node is assigned a value equal to the sum of its children's values and is added back to the set of nodes.
3.   Step 2 is repeated until only one node remains. This generates a tree whose root has a probability value of one.
4.   The encoding for each symbol is the path from the root to that symbol, with a 0 used for a left child and a 1 for a right child. Thus, the length of each symbol's encoding is inversely related to the probability of its occurrence: frequent symbols receive short codes and rare symbols receive long codes.
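
The steps above can be sketched in a few lines of Python using a priority queue; the symbol probabilities below are made up solely to illustrate the construction.

```python
# A minimal Huffman-tree sketch following steps 1-4 above.
import heapq
from itertools import count

def huffman_codes(freqs):
    """Return a dict mapping each symbol to its 0/1 code string."""
    tick = count()  # tie-breaker so nodes with equal probability are never compared
    # Step 1: one node per symbol, valued by its probability of occurrence.
    heap = [(p, next(tick), (sym, None, None)) for sym, p in freqs.items()]
    heapq.heapify(heap)
    # Steps 2-3: repeatedly merge the two smallest nodes until one remains.
    while len(heap) > 1:
        p1, _, left = heapq.heappop(heap)
        p2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (p1 + p2, next(tick), (None, left, right)))
    # Step 4: a symbol's code is its path from the root (0 = left, 1 = right).
    codes = {}
    def walk(node, path):
        sym, left, right = node
        if sym is not None:
            codes[sym] = path or "0"   # edge case: a single-symbol alphabet
            return
        walk(left, path + "0")
        walk(right, path + "1")
    walk(heap[0][2], "")
    return codes

print(huffman_codes({"a": 0.45, "b": 0.25, "c": 0.15, "d": 0.10, "e": 0.05}))
# The common symbol 'a' receives a short code; the rare symbol 'e' a long one.
```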



   Did u know? The disadvantage of Huffman encoding is that code words have differing
   lengths, so it is relatively expensive to decode. Also, a single-bit error can corrupt
   the entire remainder of the message.
Context Dependent Encoding: This technique recognizes that the probability of a particular symbol occurring next depends on the previous symbol. For instance, the probability that a P directly follows an M is about four times less than the probability of a Q following an M. The drawback of conditional-probability methods is the increase in table space: each symbol needs its own table giving the codes for the symbols that immediately follow it. However, this approach has an advantage over Huffman encoding: the code words are all of fixed length, which makes encoding and decoding using table lookups very efficient, and it is more immune to transmission errors. A small table-driven sketch follows.
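
Below is a rough Python sketch of such table-driven encoding, assuming a tiny made-up alphabet and invented conditional tables; because the codes have a fixed width (2 bits here), the decoder always knows where the next code word begins.

```python
# Context-dependent encoding sketch: each previous symbol owns its own table
# mapping the next symbol to a fixed-length 2-bit code (tables are invented).
tables = {
    "M": {"A": "00", "E": "01", "P": "10", "Q": "11"},
    "A": {"N": "00", "T": "01", "R": "10", "S": "11"},
    # ... one table per possible previous symbol (this is the space cost)
}

def encode(text, start_context="M"):
    prev, bits = start_context, ""
    for ch in text:
        bits += tables[prev][ch]   # fixed-length lookup keyed on the context
        prev = ch
    return bits

def decode(bits, start_context="M", width=2):
    prev, out = start_context, ""
    for i in range(0, len(bits), width):
        code = bits[i:i + width]   # fixed width: code boundaries are known
        ch = next(s for s, c in tables[prev].items() if c == code)
        out += ch
        prev = ch
    return out

print(decode(encode("AT")))   # -> "AT"
```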
Run Length Encoding: Run length encoding is an alternative for encoding data that contains repetitive symbols. Consider binary strings of 0s and 1s. Long runs of 0s are handled by using a k-bit symbol that indicates how many 0 bits occurred between consecutive 1s. A code word of all


