Tuto 1

Q1: A color image is in raw RGB format. Its bit depth is 12 bits per color component and its spatial resolution is 1024×480. How many bytes are required to store this image if no compression is performed?

No. of bytes = 1024×480×12×3/8 = 2,211,840 bytes

Rule:

$\text{No. of bytes} = \frac{\text{width} \times \text{height} \times \text{bit depth} \times \text{number of color channels}}{8}$
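The rule above can be checked with a short sketch. The function name `raw_image_bytes` is made up for illustration, and rounding up to a whole byte is an assumption for cases where the bit count is not a multiple of 8.

```python
def raw_image_bytes(width, height, bit_depth, channels):
    """Bytes needed for an uncompressed image (total bits rounded up to whole bytes)."""
    total_bits = width * height * bit_depth * channels
    return (total_bits + 7) // 8


# Q1: 1024x480 image, 12 bits per component, 3 components (R, G, B)
print(raw_image_bytes(1024, 480, 12, 3))  # 2211840
```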

Q2: What color would a magenta paper appear when exposed to white light?

A magenta material reflects R and B. White light contains R, G and B, so the reflected light is R + B = Magenta.

Rule:

Look at Additive primaries.

When a material reflects the colors $\{X\}$ and is exposed to light containing the colors $\{Y\}$,

the material appears as the color $\{X \cap Y\}$. (If $\{X\}$ and $\{Y\}$ have no intersection, it appears black.)
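A minimal sketch of this rule using set intersection over the additive primaries. The table `COLOR_NAMES` and the function `apparent_color` are illustrative names, not part of the tutorial.

```python
# Name of the color seen for each combination of reflected additive primaries.
COLOR_NAMES = {
    frozenset(): "black",
    frozenset("R"): "red",
    frozenset("G"): "green",
    frozenset("B"): "blue",
    frozenset("RG"): "yellow",
    frozenset("RB"): "magenta",
    frozenset("GB"): "cyan",
    frozenset("RGB"): "white",
}


def apparent_color(reflects, light):
    """Color seen = primaries the material reflects AND the light contains."""
    return COLOR_NAMES[frozenset(reflects) & frozenset(light)]


print(apparent_color("RB", "RGB"))  # magenta paper under white light
print(apparent_color("RB", "G"))    # magenta paper under green light
```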

Q3: Given 3 Colors whose RGB representations are given as follows:

Color A: (0.5, 0.5, 0.5)

Color B: (0.4, 0.6, 0.5)

Color C: (0.3, 0.7, 0.5)

$H=\left\{\begin{array}{ccc} \theta & \text {if } \quad B \leq G \\ 2 \pi-\theta & \text { if } \quad B>G \end{array}\right.$

$\text { where } \quad \theta=\cos ^{-1}\left\{\frac{\frac{1}{2}[(R-G)+(R-B)]}{\left[(R-G)^{2}+(R-B)(G-B)\right]^{1 / 2}}\right\}$

$S=1-\frac{3}{R+G+B}[\min (R, G, B)]$

$I=\frac{1}{3}(R+G+B)$

Q3(a): Which Color does not carry chrominance (Color) Information?

Saturation indicates the spectral purity of the color in the light (Chrominance)

Color A -> H = 90° (nominal value: $\theta$ is 0/0 when R = G = B, so the hue is actually undefined), S = 0, I = 0.5 -> S = 0, no chrominance information

Rule:

Chrominance Information depends on S.

$\mathrm{S}=1-\frac{3}{\mathrm{R}+\mathrm{G}+\mathrm{B}}[\min (\mathrm{R}, \mathrm{G}, \mathrm{B})]$

Q3(b): Which Color has a Stronger Luminance (Intensity of Light)?

Color B -> I = 0.5

Color C -> I = 0.5

All the colors have the same luminance (I = 0.5).

Rule:

Luminance depends on I.

$I = \frac{1}{3}(R+G+B)$

Q3(c): What’s the predominant spectral color of Color C?

Value of R, G, B, C, M, Y of Color C:

R = 0.3, G = 0.7, B = 0.5, C = 1-R = 0.7 , M = 1-G = 0.3 , Y = 1-B = 0.5

max(R,G,B,C,M,Y) = 0.7 (Green and Cyan)

Color C lies midway between Green and Cyan (or simply Green, if only R, G and B are considered).

Rule:

Predominant spectral color depends on H, or can be read directly from the RGB values.
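The H, S and I formulas given for Q3 can be evaluated numerically for Colors A, B and C. This is a sketch only: `rgb_to_hsi` is an illustrative name, hue is reported in degrees, and it is returned as `None` when R = G = B because the $\theta$ formula is 0/0 in that case (an assumption about how to handle the undefined case).

```python
import math


def rgb_to_hsi(r, g, b):
    """Return (H in degrees or None, S, I) using the Q3 formulas."""
    total = r + g + b
    i = total / 3
    s = 1 - 3 / total * min(r, g, b) if total > 0 else 0.0
    num = 0.5 * ((r - g) + (r - b))
    den = math.sqrt(max(0.0, (r - g) ** 2 + (r - b) * (g - b)))
    if den < 1e-12:
        return None, s, i  # grey: hue is undefined
    ratio = max(-1.0, min(1.0, num / den))  # guard against rounding error
    theta = math.degrees(math.acos(ratio))
    h = theta if b <= g else 360 - theta
    return h, s, i


for name, rgb in [("A", (0.5, 0.5, 0.5)), ("B", (0.4, 0.6, 0.5)), ("C", (0.3, 0.7, 0.5))]:
    print(name, rgb_to_hsi(*rgb))
```

For Color C the hue comes out at about 150°, between green (120°) and cyan (180°), which agrees with the answer to Q3(c).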

Q4: The bit depth of a grey-level image is 4. Mark all its possible pixel colors in the RGB space.

$2^4 = 16$ coordinates

The RGB coordinates are of the form $\left(\frac{k}{15}, \frac{k}{15}, \frac{k}{15}\right)$ where $k \in \{0, 1, 2, \dots, 15\}$

Rule:

Grey-level means R, G and B have the same value.

The bit depth determines the number of possible pixel colors.
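As a sketch, the grey-level coordinates for any bit depth can be listed as follows. `grey_levels` is an illustrative name, and `Fraction` is used only to keep the values exact.

```python
from fractions import Fraction


def grey_levels(bit_depth):
    """All (R, G, B) points of a grey-level image, normalized to [0, 1]."""
    top = 2 ** bit_depth - 1
    return [(Fraction(k, top),) * 3 for k in range(top + 1)]


levels = grey_levels(4)
print(len(levels))            # 16 points on the diagonal of the RGB cube
print(levels[0], levels[-1])  # black and white
```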

Tuto 2

The first step is to reconstruct $C_b$ and $C_r$ to 4:4:4.

$C_b$ :

$\begin{array}{|llll|} \hline 87 & 87 & 67 & 67 \\ 23 & 23 & 32 & 32 \\ 31 & 31 & 30 & 30 \\ 20 & 20 & 10 & 10 \\ \hline \end{array}$

$C_r$ :

$\begin{array}{|llll|} \hline 87 & 87 & 67 & 67 \\ 23 & 23 & 72 & 72 \\ 12 & 12 & 14 & 14 \\ 98 & 98 & 32 & 32 \\ \hline \end{array}$

4:1:1

Note that the numbers are rounded half up. (Example 0.5 -> 1)

$\begin{array}{ll}C_b \\ 77 \\ 28 \\ 31 \\ 15\end{array}$

$\begin{array}{ll}C_r \\ 77 \\ 48 \\ 13 \\ 65\end{array}$

4:2:0

Note that the numbers are rounded half up. (Example 0.5 -> 1)

$\begin{array}{ll}C_b \\ 55 & 50 \\ 26 & 20 \end{array}$

$\begin{array}{ll}C_r \\ 55 & 70 \\ 55 & 23 \end{array}$
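The 4:1:1 and 4:2:0 values above can be reproduced by averaging blocks of the 4:4:4 planes. This is a sketch under the assumption that 4:1:1 averages each group of 4 horizontal samples and 4:2:0 averages each 2x2 block; `block_average` and `round_half_up` are illustrative names. Python's built-in `round` is avoided because it rounds halves to the nearest even number.

```python
import math


def round_half_up(x):
    return math.floor(x + 0.5)


def block_average(plane, block_h, block_w):
    """Average non-overlapping block_h x block_w blocks of a 2-D list."""
    rows, cols = len(plane), len(plane[0])
    out = []
    for r in range(0, rows, block_h):
        out_row = []
        for c in range(0, cols, block_w):
            vals = [plane[i][j]
                    for i in range(r, r + block_h)
                    for j in range(c, c + block_w)]
            out_row.append(round_half_up(sum(vals) / len(vals)))
        out.append(out_row)
    return out


cb = [[87, 87, 67, 67], [23, 23, 32, 32], [31, 31, 30, 30], [20, 20, 10, 10]]
cr = [[87, 87, 67, 67], [23, 23, 72, 72], [12, 12, 14, 14], [98, 98, 32, 32]]

print(block_average(cb, 1, 4), block_average(cr, 1, 4))  # 4:1:1
print(block_average(cb, 2, 2), block_average(cr, 2, 2))  # 4:2:0
```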

Median Filtering of Y

Note that the numbers are rounded half up. (Example 0.5 -> 1)

Apply a 3x3 filter to get all the values inside the mask. Where the window extends beyond the image boundary, the out-of-bounds positions are ignored.

During median filtering, the sorted sequences for each pixel are:

(24,45,65,81) (24,25,45,65,65,81) (23,25,65,65,81,99) (23,25,65,99)

(24,27,32,45,65,81) (24,25,27,32,45,65,65,81,89) (10,23,25,27,65,65,81,89,99) (10,23,25,65,89,99)

(15,24,27,32,34,81) (15,24,27,32,34,65,81,89,134) (10,23,27,34,65,81,89,99,134) (10,23,65,89,99,134)

(15,27,32,34) (15,27,32,34,89,134) (10,23,27,34,89,134) (10,23,89,134)

Therefore after median filtering, the output of the Y plane is:

$\begin{array}{|llll|} \hline 55 & 55 & 65 & 45 \\ 39 & 45 & 65 & 45 \\ 30 & 34 & 65 & 77 \\ 30 & 33 & 31 & 56 \\ \hline \end{array}$
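A minimal sketch of this procedure: collect the in-bounds neighbours of each pixel, take the median (the mean of the two middle values when the count is even) and round half up. The function names are illustrative. The original Y plane is not reproduced in these notes, so the check below uses the sorted sequences listed above.

```python
import math


def rounded_median(values):
    """Median of a sequence, rounded half up."""
    s = sorted(values)
    n = len(s)
    mid = n // 2
    med = s[mid] if n % 2 else (s[mid - 1] + s[mid]) / 2
    return math.floor(med + 0.5)


def median_filter_3x3(img):
    """3x3 median filter that ignores window positions outside the image."""
    rows, cols = len(img), len(img[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [img[i][j]
                      for i in range(max(0, r - 1), min(rows, r + 2))
                      for j in range(max(0, c - 1), min(cols, c + 2))]
            out_row.append(rounded_median(window))
        out.append(out_row)
    return out


print(rounded_median([24, 45, 65, 81]))          # top-left output pixel
print(rounded_median([24, 27, 32, 45, 65, 81]))  # first pixel of row 2
```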


Area of $T_1$ = Area of $T_2$

Let $t_1$ and $t_2$ be the two areas, so $t_1 = t_2$

$t_1 = \int^{r_1}_0 p_r(r) dr$

$t_2 = \int^{z_1}_0 p_z(z) dz$

$\text { Slope } m=\frac{y_{2}-y_{1}}{x_{2}-x_{1}}$

To find the equation of $p_r(r)$ :

Coordinates of $p_r(r)$ : (0,2), (1,0)

$\frac{0-2}{1-0} = m = -2$

y = mx + c

Substituting (1,0) and m = -2 into the equation, we get

0 = -2(1) + c

c = 2

y = -2x + 2

Equation: $p_r(r) = 2-2r$

To find the equation of $p_z(z)$ :

y = mx + c

Substituting (1,2) and (0,0):

2 = m + c

0 = c

m = 2

Equation: $p_z(z) = 2z$

Then carry out the integration:

$t_1 = \int^{r_1}_0 p_r(r) dr = \int^{r_1}_0 (2-2r)dr = (2-r_1)r_1 = 2r_1 - r_1^2$

$t_2 = \int^{z_1}_0 p_z(z) dz = \int^{z_1}_0 (2z)dz = z_1^2$

Setting $t_1 = t_2$ gives $z_1^2 = 2r_1 - r_1^2$, so $z = \sqrt{2r-r^2}$
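A quick numerical check of the mapping: for any $r$ in [0, 1], the area under $p_z$ up to the mapped $z$ should equal the area under $p_r$ up to $r$. The function names here are illustrative.

```python
import math


def cdf_r(r):
    """Area under p_r(r) = 2 - 2r from 0 to r."""
    return 2 * r - r * r


def cdf_z(z):
    """Area under p_z(z) = 2z from 0 to z."""
    return z * z


def match(r):
    """Mapping derived above: z = sqrt(2r - r^2)."""
    return math.sqrt(2 * r - r * r)


for r in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(r, match(r), cdf_r(r), cdf_z(match(r)))
```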

Originally, the image is 3-bit (levels 0-7).

The resolution is now increased to 8-bit (levels 0-255).

At the same time, its histogram is equalized.

To get a total probability of 1, we need the cumulative frequency count.

Then normalize: cumulative frequency count / total count.

Finally, multiply the normalized value by $(2^{\text{bit}}-1)$ to get the output level.

input level to output level:

0 : $\frac{3}{24} \times 255 = 31.875 \approx 32$ (Rounded off)

1 : $\frac{8}{24} \times 255 = 85$

2 : $\frac{12}{24} \times 255 = 127.5 \approx 128$ (Rounded off)

3 : $\frac{19}{24} \times 255 = 201.875 \approx 202$ (Rounded off)

4 : $\frac{19}{24} \times 255 = 201.875 \approx 202$ (Rounded off)

5 : $\frac{20}{24} \times 255 = 212.5 \approx 213$ (Rounded off)

6 : $\frac{21}{24} \times 255 = 223.125 \approx 223$ (Rounded off)

7 : $\frac{24}{24} \times 255 = 255$

Therefore:

$\begin{array}{|llllll|} \hline 202 & 255 & 128 & 85 & 128 & 202 \\ 202 & 85 & 202 & 85 & 255 & 202 \\ 32 & 32 & 85 & 223 & 202 & 128 \\ 255 & 32 & 213 & 85 & 128 & 202 \\ \hline \end{array}$
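The level mapping can be sketched as below. The per-level counts `[3, 5, 4, 7, 0, 1, 1, 3]` are not stated in the notes; they are derived here from the cumulative counts 3, 8, 12, 19, 19, 20, 21, 24 used above. `equalization_map` is an illustrative name, and the rounding is done in integer arithmetic (round half up) to avoid floating-point error.

```python
from itertools import accumulate


def equalization_map(counts, out_bits):
    """Map each input level to round_half_up(cumulative / total * (2^bits - 1))."""
    total = sum(counts)
    top = 2 ** out_bits - 1
    return [(2 * cum * top + total) // (2 * total) for cum in accumulate(counts)]


counts = [3, 5, 4, 7, 0, 1, 1, 3]  # occurrences of input levels 0..7
mapping = equalization_map(counts, 8)
print(mapping)
```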

Tuto 3

Q1: Determine which of the following operators is for detecting horizontal (vertical) lines.

Rule:

How about other angles?

Horizontal line + 45° line = 22.5°

Horizontal line + -45° line = -22.5°

Vertical line + 45° line = 67.5°

Vertical line + -45° line = -67.5°

Q2:

First count the frequency of each intensity.

Use the average intensity as the initial threshold value = 6.375

Then split the pixels into 2 regions using the threshold, and count the frequency of each intensity in each region.

Compute a new threshold $T = (\mu_1+\mu_2)/2 = 6.25$, where $\mu_1$ and $\mu_2$ are the mean intensities of the two regions.

Repeat until the difference in T between successive iterations is smaller than a chosen value.

The final threshold = 6.25
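The iteration above can be sketched as follows. The tutorial image is not reproduced in these notes, so the pixel list in the example is hypothetical. The stopping tolerance `eps` and the choice to put pixels equal to the threshold in the lower region are also assumptions.

```python
def iterative_threshold(pixels, eps=0.01):
    """Iterate T = (mu1 + mu2) / 2, starting from the mean intensity."""
    t = sum(pixels) / len(pixels)
    while True:
        low = [p for p in pixels if p <= t]
        high = [p for p in pixels if p > t]
        if not low or not high:
            return t  # all pixels fall in one region; nothing to split
        new_t = (sum(low) / len(low) + sum(high) / len(high)) / 2
        if abs(new_t - t) < eps:
            return new_t
        t = new_t


# Hypothetical data, not the tutorial image.
print(iterative_threshold([0, 0, 0, 0, 10]))
```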

Q3:

Split a region if the maximum intensity difference in the region > 1.

Firstly, do the quad-tree decomposition.

Then draw the complete quad tree. (Note: annotate the tree to show which node corresponds to which region.)

Merge the regions to form the final segmentation result (select the largest region first, from left to right). Use the average intensity of a region in merging the regions.

Therefore: Start from:

Then finally:

Q4:

The seed point (starting point) is selected by the user (usually the highest or lowest intensity).

Select a threshold.

Then start growing (following the checking order).

Tuto 4

Q1: Use the splitting technique to obtain a polygon for representing a circle. The splitting criterion is that the maximum distance of the boundary from the corresponding side of the approximated polygon is larger than $r(1-1 / \sqrt{2})$ , where $r$ is the radius of the circle.

Split once

Calculate $r \cos 45^\circ = r/\sqrt2$

distance from the farthest boundary pixel = $r - r/\sqrt2 = r(1 - \frac{1}{\sqrt2})$

threshold = $r(1-1/\sqrt2)$ (given in question)

Since the distance from the farthest boundary pixel is NOT > threshold, no further splitting
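A small sketch of the stopping test, assuming each split of a circle produces a regular inscribed polygon (the diameter counts as a degenerate 2-sided polygon, and one split gives a square). The names and the tolerance `tol` are illustrative; the tolerance only protects the equality case from floating-point noise.

```python
import math


def max_deviation(r, n_sides):
    """Largest gap between a circle of radius r and an inscribed regular n-gon."""
    return r * (1 - math.cos(math.pi / n_sides))


def needs_split(r, n_sides, tol=1e-12):
    """True if the gap is strictly larger than the threshold r(1 - 1/sqrt(2))."""
    threshold = r * (1 - 1 / math.sqrt(2))
    return max_deviation(r, n_sides) > threshold + tol


print(needs_split(1.0, 2))  # diameter only: the gap is r, so split
print(needs_split(1.0, 4))  # square: the gap equals the threshold, so stop
```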

Q2:

(i)
By just looking at the first 3 components:

A : (80,40,0)

B : (-20, 60, -20)

C : (0, 113, 0)

D : (35, 129, 0)

Compare to the Fourier descriptors of a circle (0,40,0):

C is likely a scaled circle

A and D are likely translated circles (because a DC component, the first component, appears)

(ii)

Use the IDFT on the Fourier descriptors of Object B to recover its boundary coordinates.

IDFT([-20,60,-20,20,-20,21,-20,20])

Coordinates are :

Use the IDFT on the Fourier descriptors of Object D to recover its boundary coordinates.

The final result should look something like this:
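The IDFT step in (ii) can be sketched in plain Python. The convention used here, $s(k) = \frac{1}{K}\sum_u a(u) e^{j 2\pi u k / K}$, is an assumption: some texts put the $1/K$ factor on the forward transform instead, which scales the recovered coordinates by $K$. Each output sample is read as a boundary point $x + jy$.

```python
import cmath


def idft(descriptors):
    """Inverse DFT: s(k) = (1/K) * sum over u of a(u) * exp(j*2*pi*u*k/K)."""
    K = len(descriptors)
    return [sum(a * cmath.exp(2j * cmath.pi * u * k / K)
                for u, a in enumerate(descriptors)) / K
            for k in range(K)]


# Fourier descriptors of Object B, as listed in (ii)
points_b = idft([-20, 60, -20, 20, -20, 21, -20, 20])
for p in points_b:
    print(round(p.real, 3), round(p.imag, 3))
```

Under this convention, a descriptor vector with only the second component non-zero, such as (0, 40, 0, ...), gives points that all lie at the same distance from the origin, which is why it is read as a circle.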

(iii)

From the Graph, it is D.

From the diameter, it is also D.

Q3:

Correct Answer:

You may ask, why is it not a straight line instead?

Because a pixel is marked as part of the medial axis (skeleton of the region) only if there is more than 1 shortest path (shortest distance) from the pixel centre to the boundary.

So this one (the horizontal red line) is a wrong MAT.

Wrong Answer (and why it is wrong):

Q4:

Chain code starting at the top:

{67066434311}

First compute the difference code, then normalize the starting point.

Normalized with respect to the orientation of the object:

Consider the chain code {67066434311}

Therefore Difference = {51160671760}

Normalized with respect to the starting point = the circular rotation of the code that forms the minimum integer

Using the code normalized with respect to the orientation of the object {51160671760},

Therefore Normalized with respect to the starting point = {05116067176}
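Both normalization steps can be sketched as below, assuming an 8-direction chain code and a circular first difference (the first element is compared with the last). The function names are illustrative.

```python
def first_difference(chain, directions=8):
    """Circular first difference: direction changes between consecutive elements."""
    return [(chain[i] - chain[i - 1]) % directions for i in range(len(chain))]


def min_rotation(code):
    """Circular rotation of the code that forms the smallest integer."""
    return min(code[i:] + code[:i] for i in range(len(code)))


chain = [6, 7, 0, 6, 6, 4, 3, 4, 3, 1, 1]
diff = first_difference(chain)
print("".join(map(str, diff)))                # 51160671760
print("".join(map(str, min_rotation(diff))))  # 05116067176
```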

Tuto 5

Q1: A series of messages is to be transferred between two computers over a PSTN. The messages comprise just the characters A through H. Analysis has shown that the probability (relative frequency of occurrence) of each character is as follows:

A and B = 0.25,

C and D = 0.14,

E, F, G, and H = 0.055

Q1(a): By computing the entropy of the source, derive the minimum average number of bits per character.

Entropy, $H$, of $N$ random variables $W_i$, where $i = 1, 2, \dots, N$, with corresponding probabilities $P_i$, is given by:

$H=-\sum_{i=1}^{N} P_{i} \log _{2} P_{i}$

Then the entropy of these 8 inputs:

$N = 8$

$H=-\sum_{i=1}^{8} P_{i} \log _{2} P_{i}$

$H=- (2\times (0.25 \log _{2} 0.25) + 2\times (0.14 \log _{2} 0.14) + 4\times (0.055 \log _{2} 0.055))$

$H \approx 2.715$ bits per character
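The entropy value can be checked with a few lines. `entropy` is an illustrative helper name.

```python
import math


def entropy(probabilities):
    """Shannon entropy in bits: H = -sum(P_i * log2(P_i))."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# A, B = 0.25; C, D = 0.14; E, F, G, H = 0.055
probs = [0.25, 0.25, 0.14, 0.14, 0.055, 0.055, 0.055, 0.055]
h = entropy(probs)
print(round(h, 3))
```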

Q1(b): Use Huffman coding to derive a codeword set.

We first sort them by probability (from highest to lowest).

In each step, we combine the lowest probability with the second lowest probability, then sort again to form the next stage.

Then we can start to assign codewords, from the shortest to the longest.

In the last stage, we assign a higher prob with 0, lower prob with 1.

Break it into 2, Assign 1 to high probability and Assign 0 to low probability

If the element is not alone, break it further until it is alone.

If the element is alone, the assigned bits form its codeword.

To find the average codeword length, compute $\text{prob} \times \text{number of assigned bits}$ for each symbol,

then add them all together.

$2(2\times0.25)+2(3\times0.14)+4(4\times0.055) = 2.72$ bits/codeword
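The codeword lengths can be reproduced with a heap-based sketch of Huffman merging. It tracks only the length of each codeword, not the 0/1 labels, so it does not depend on which branch gets 0 or 1. The function name and the integer tie-breaker are illustrative choices.

```python
import heapq
from itertools import count


def huffman_code_lengths(probs):
    """Return {symbol: codeword length} for a dict of symbol probabilities."""
    tie = count()  # tie-breaker so the heap never compares the symbol lists
    heap = [(p, next(tie), [sym]) for sym, p in probs.items()]
    heapq.heapify(heap)
    lengths = {sym: 0 for sym in probs}
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)  # lowest probability
        p2, _, syms2 = heapq.heappop(heap)  # second lowest probability
        for sym in syms1 + syms2:
            lengths[sym] += 1  # each merge adds one bit to these symbols
        heapq.heappush(heap, (p1 + p2, next(tie), syms1 + syms2))
    return lengths


probs = {"A": 0.25, "B": 0.25, "C": 0.14, "D": 0.14,
         "E": 0.055, "F": 0.055, "G": 0.055, "H": 0.055}
lengths = huffman_code_lengths(probs)
average = sum(probs[s] * lengths[s] for s in probs)
print(lengths, round(average, 2))
```

The average of 2.72 bits sits just above the entropy of about 2.715 bits from Q1(a), as expected.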

Q2:

Q2(a) find the output bitstream for each 4x4 block.

First apply a zigzag scan to get the 1-dimensional array.

The first term is the DC DIFF; the other terms are AC.

(If the trailing ACs are all 0s, just denote them as EOB.)

DC DIFF means (DC of this block - DC of the previous block).

Then turn them into run-level representation.

For DC:

$\text{AMPLITUDE} = \text{DC DIFF}$

$\operatorname{SIZE}=\lceil\log _{2}(|x|+1)\rceil$

the run-level representation is (SIZE,AMPLITUDE).

For AC:

the run-level representation is (number of 0s, SIZE)(AC number).

(For EOB, just denote as (0,0).)

Finally, map them according to the tables.

For DC:

SIZE as index => Match the DC Coding table

AMPLITUDE as binary (If AMPLITUDE is -ve, take the 1’s complement of the binary of its magnitude)

For AC:

(number of 0s, SIZE) => Match the AC Coding table

AC number as binary (If AC number is -ve, take the 1’s complement of the binary of its magnitude)

Then join the DC code with all the AC codes to get the output bitstream.
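A sketch of the SIZE and binary-amplitude steps. The DC and AC coding tables are not reproduced in these notes, so the table lookup is left out. The function names are illustrative. For integers, `abs(x).bit_length()` gives the same value as $\lceil\log_2(|x|+1)\rceil$.

```python
def size_category(x):
    """SIZE = ceil(log2(|x| + 1)) for an integer coefficient x."""
    return abs(x).bit_length()


def amplitude_bits(x):
    """Binary amplitude in SIZE bits; negative values use the 1's complement."""
    size = size_category(x)
    if size == 0:
        return ""  # a zero value needs no amplitude bits
    bits = format(abs(x), "0{}b".format(size))
    if x < 0:
        # 1's complement: flip every bit of the magnitude
        bits = "".join("1" if b == "0" else "0" for b in bits)
    return bits


for value in (5, -5, -3, 1, 0):
    print(value, size_category(value), amplitude_bits(value))
```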

Q2(b): Determine the compression ratio for each 4x4 block.

First find the No. of bits required for the original block

$\text{No. of bits required for the original block} = \text{block size} \times 8$ (8 bits per pixel)

In this case, the block is 4x4, so 16x8 = 128 bits.

Then find the number of bits in the bitstream (the answer from part (a)) for each block.

Finally find the Compression ratio.

$\text{Compression ratio} = \frac{\text{No. of bits required for the original block}}{\text{the number of bits of bitstream in a block}}$
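As a sketch, the ratio can be computed directly from a bitstream written as a string of 0s and 1s. The 32-bit string in the example is a hypothetical placeholder, not an answer to Q2(a), and `compression_ratio` is an illustrative name.

```python
def compression_ratio(block_side, bitstream, bits_per_pixel=8):
    """Original bits of a square block divided by the coded bitstream length."""
    original_bits = block_side * block_side * bits_per_pixel
    return original_bits / len(bitstream)


# Hypothetical 32-bit bitstream for a 4x4 block: 128 / 32
print(compression_ratio(4, "0" * 32))
```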