Page 191 - DCAP303_MULTIMEDIA_SYSTEMS
P. 191
Unit 11: Compression
11.4 redundancies and its types notes
The term data compression refers to the process of reducing the amount of data required to
represent a given quantity of information. A clear distinction must be made between data and
information. They are not synonymous. In fact, data are the means by which information is
conveyed. Various amounts of data may be used to represent the same amount of information.
Such might be the case, for example, if a long-winded individual and someone who is short and
to the point where to relate the same story. Here, the information of interest is the story; words are
the data used to relate the information. If the two individuals use a different number of words to
tell the same basic story, two different versions of the story are created, and at least one includes
nonessential data. That is, it contains data (or words) that either provide no relevant information
or simply restate that which is already known. It is thus said to contain data redundancy. Data
redundancy is a central issue in digital image compression. It is not an abstract concept but a
mathematically quantifiable entity. If n and n denote the number of information-carrying units
1
2
in two data sets that represent the same information, the relative data redundancy RD of the first
data set (the one characterized by n ) can be defined as
1
R = 1 − C 1
D
R
where C , commonly called the compression ratio, is
R
C = n n 2 1
R
For the case n = n , C = 1 and R = 0, indicating that (relative to the second data set) the first
D
2
R
1
representation of the information contains no redundant data. When n << n , C → ∞ and
1
R
2
R → 1, implying significant compression and highly redundant data. Finally, when n >> n ,
D
1
2
C → 0 and R → ∞, indicating that the second data set contains much more data than the original
R
D
representation. This, of course, is the normally undesirable case of data expansion. In general,
C and R lie in the open intervals (0, ∞) and (–∞, 1), respectively. A practical compression ratio,
D
R
such as 10 (or 10 : 1), means that the first data set has 10 information carrying units (say, bits) for
every 1 unit in the second or compressed data set. The corresponding redundancy of 0.9 implies
that 90% of the data in the first data set is redundant.
In digital image compression, three basic data redundancies can be identified and exploited:
Coding redundancy, interpixel redundancy and psychovisual redundancy.
11.4.1 Coding redundancy
In this, we utilize formulation to show how the grey-level histogram of an image also can
provide a great deal of insight into the construction of codes to reduce the amount of data used
to represent it.
Let us assume, once again, that a discrete random variable r in the interval [0, 1] represents the
k
grey levels of an image and that each r occurs with probability p (r ).
k
k
r
P (r ) = n n k k = 0, 1, 2, ..., L – 1
k
r
where, L is the number of grey levels, n is the number of times that the k th grey level appears in
k
the image, and n is the total number of pixels in the image. If the number of bits used to represent
each value of r is l (r ), then the average number of bits required to represent each pixel is
k
k
L − 1
()
L avg = ∑ lr pr().
k
r
k
k = 0
That is, the average length of the code words assigned to the various grey-level values is found by
summing the product of the number of bits used to represent each grey-level and the probability
that the grey-level occurs. Thus, the total number of bits required to code an M X N image is
MNL avg .
LoveLy professionaL University 185