Image and Audio Processing - Questions and Answers [1]
Tuto 1
Q1: A color image is in raw RGB format. Its bit depth is 12 bits per color component and its spatial resolution is 1024×480. How many bytes are required to store this image if no compression is performed?
No. of bytes = 1024×480×12×3/8 = 2,211,840 bytes
Rule: No. of bytes = width × height × bits per component × number of components / 8
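The arithmetic in Q1 can be checked with a short Python sketch. The function name and its parameters are illustrative only; the final byte count is rounded up to a whole byte, which makes no difference for the numbers in this question.

```python
def raw_image_bytes(width, height, bits_per_component, components=3):
    """Uncompressed size in bytes (rounded up to a whole byte)."""
    total_bits = width * height * bits_per_component * components
    return (total_bits + 7) // 8


# Q1: 1024x480 pixels, 12 bits per component, 3 components (R, G, B)
print(raw_image_bytes(1024, 480, 12))
```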
Q2: What color would a magenta paper appear when exposed to white light?
The material reflects R and B, and white light contains R, G and B, so the reflected light is R + B = magenta.
Rule:
Look at the additive primaries.
When a material that reflects certain colors is exposed to light containing certain colors,
the material appears as the colors common to both. (If the two sets have no color in common, the material appears black.)
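The rule can be sketched as a set intersection over the additive primaries. The names `NAMES` and `apparent_color` are made up for this illustration.

```python
# Names of every combination of the additive primaries R, G, B.
NAMES = {
    frozenset(): "black",
    frozenset("R"): "red",
    frozenset("G"): "green",
    frozenset("B"): "blue",
    frozenset("RG"): "yellow",
    frozenset("GB"): "cyan",
    frozenset("RB"): "magenta",
    frozenset("RGB"): "white",
}


def apparent_color(reflects, light):
    """Color seen = primaries that are both reflected and present in the light."""
    return NAMES[frozenset(set(reflects) & set(light))]


# Q2: magenta paper (reflects R and B) under white light (R, G and B)
print(apparent_color("RB", "RGB"))
```

The same function shows the no-intersection case: magenta paper under pure green light has nothing to reflect and appears black.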
Q3: Given 3 Colors whose RGB representations are given as follows:
Color A: (0.5, 0.5, 0.5)
Color B: (0.4, 0.6, 0.5)
Color C: (0.3, 0.7, 0.5)
Q3(a): Which Color does not carry chrominance (Color) Information?
Saturation indicates the spectral purity of the color (chrominance).
Color A -> H = 90°, S = 0, I = 0.5 -> S = 0, so Color A carries no chrominance information.
Rule:
Chrominance Information depends on S.
Q3(b): Which Color has a Stronger Luminance (Intensity of Light)?
Color B -> I = 0.5
Color C -> I = 0.5
Both colors have the same luminance (Color A also has I = 0.5).
Rule:
Luminance depends on I.
Q3(c): What’s the predominant spectral color of Color C?
Value of R, G, B, C, M, Y of Color C:
R = 0.3, G = 0.7, B = 0.5, C = 1-R = 0.7 , M = 1-G = 0.3 , Y = 1-B = 0.5
max(R,G,B,C,M,Y) = 0.7 (Green and Cyan)
Color C lies midway between green and cyan (or simply green, if only R, G and B are considered).
Rule:
Predominant spectral color depends on H, or simply compare the RGB values.
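The H, S and I values used in Q3 can be reproduced with the common textbook RGB-to-HSI conversion. This is a sketch under that assumption; the notes do not state which formula was used, and the function name is illustrative. Hue is returned as `None` for a grey color, because it has no meaning when S = 0.

```python
import math


def rgb_to_hsi(r, g, b):
    """Return (H in degrees or None, S, I) for RGB values in [0, 1]."""
    i = (r + g + b) / 3
    s = 0.0 if i == 0 else 1 - min(r, g, b) / i
    den = math.sqrt(max(0.0, (r - g) ** 2 + (r - b) * (g - b)))
    if den == 0:
        return None, s, i  # grey: hue is undefined
    cos_theta = max(-1.0, min(1.0, 0.5 * ((r - g) + (r - b)) / den))
    theta = math.degrees(math.acos(cos_theta))
    h = theta if b <= g else 360 - theta
    return h, s, i


for name, rgb in [("A", (0.5, 0.5, 0.5)), ("B", (0.4, 0.6, 0.5)), ("C", (0.3, 0.7, 0.5))]:
    print(name, rgb_to_hsi(*rgb))
```

With these formulas, Color C comes out with a hue between green (120°) and cyan (180°), which agrees with the answer to Q3(c).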
Q4: The bit depth of a grey-level image is 4. Mark all its possible pixel colors in the RGB space.
The RGB coordinates are of the form (v, v, v), where v is one of the 2^4 = 16 grey levels.
Rule:
Grey-level means the R, G and B values are equal.
The bit depth determines the number of pixel colors (2^4 = 16 here).
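As a small illustration of the rule, the grey points can be listed programmatically. This sketch assumes the RGB cube is normalised to [0, 1]; the original answer may have used integer coordinates instead.

```python
def grey_levels(bit_depth):
    """All grey points (R = G = B) of a grey-level image in a unit RGB cube."""
    top = 2 ** bit_depth - 1
    return [(k / top,) * 3 for k in range(top + 1)]


points = grey_levels(4)
print(len(points))            # number of grey levels
print(points[0], points[-1])  # black and white ends of the diagonal
```

All the points lie on the diagonal of the RGB cube from black to white.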
Tuto 2
The first step is to reconstruct the chrominance planes to 4:4:4.
4:1:1
Note the numbers are rounded up. (Example 0.5 -> 1)
4:2:0
Note the numbers are rounded up. (Example 0.5 -> 1)
Median Filtering of Y
Note the numbers are rounded up. (Example 0.5 -> 1)
Slide a 3x3 window over the Y plane and collect all the values inside the window. Where the window extends beyond the image border, the out-of-bounds positions are ignored.
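During median filtering, the sorted sequences for each pixel (row by row) are:
Row 1: (24,45,65,81) (24,25,45,65,65,81) (23,25,65,65,81,99) (23,25,65,99)
Row 2: (24,27,32,45,65,81) (24,25,27,32,45,65,65,81,89) (10,23,25,27,65,65,81,89,99) (10,23,25,65,89,99)
Row 3: (15,24,27,32,34,81) (15,24,27,32,34,65,81,89,134) (10,23,27,34,65,81,89,99,134) (10,23,65,89,99,134)
Row 4: (15,27,32,34) (15,27,32,34,89,134) (10,23,27,34,89,134) (10,23,89,134)
For a sequence with an even number of values, the median is the average of the two middle values, rounded up.
Therefore, after median filtering, the output of the Y plane is:
55 55 65 45
39 45 65 45
30 34 65 77
30 33 31 56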
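The medians of the sequences listed above can be checked mechanically. The sketch below takes the sequences exactly as given and assumes that, for an even number of values, the median is the average of the two middle values with 0.5 rounded up, as the rounding note suggests. The function name is illustrative.

```python
def median_round_up(values):
    """Median of a list of integers; an average ending in .5 is rounded up."""
    s = sorted(values)
    n = len(s)
    if n % 2:
        return s[n // 2]
    return (s[n // 2 - 1] + s[n // 2] + 1) // 2


# Window contents from the notes, one inner list per output row.
windows = [
    [(24, 45, 65, 81), (24, 25, 45, 65, 65, 81),
     (23, 25, 65, 65, 81, 99), (23, 25, 65, 99)],
    [(24, 27, 32, 45, 65, 81), (24, 25, 27, 32, 45, 65, 65, 81, 89),
     (10, 23, 25, 27, 65, 65, 81, 89, 99), (10, 23, 25, 65, 89, 99)],
    [(15, 24, 27, 32, 34, 81), (15, 24, 27, 32, 34, 65, 81, 89, 134),
     (10, 23, 27, 34, 65, 81, 89, 99, 134), (10, 23, 65, 89, 99, 134)],
    [(15, 27, 32, 34), (15, 27, 32, 34, 89, 134),
     (10, 23, 27, 34, 89, 134), (10, 23, 89, 134)],
]

filtered = [[median_round_up(w) for w in row] for row in windows]
for row in filtered:
    print(row)
```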
Area of = Area of
Let be the area,
To find the equation of the line through (0,2) and (1,0):
y = mx + c
m = (0 - 2)/(1 - 0) = -2
Substituting (1,0) and m = -2 into the equation, we get
0 = -2(1) + c
c = 2
Therefore the equation is y = -2x + 2
To find the equation of the line through (1,2) and (0,0):
y = mx + c
Substituting (1,2) and (0,0):
2 = m + c
0 = c
m = 2
Therefore the equation is y = 2x
Then carry out the integration.
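A sketch of the integration, assuming both areas are measured from 0 up to a variable upper limit. The limits and the symbols $r$ and $z$ are placeholders chosen here, because the original labels did not survive.

```latex
\int_0^{r} (-2x + 2)\,dx = -r^2 + 2r,
\qquad
\int_0^{z} 2x\,dx = z^2
```

Under that assumption, equating the two areas gives $z^2 = 2r - r^2$, that is $z = \sqrt{2r - r^2}$ for $0 \le r \le 1$.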
Originally the image has 3 bits (levels 0-7).
The resolution is now increased to 8 bits (levels 0-255).
At the same time, the histogram is equalized.
For the total probability to equal 1, we need the cumulative frequency count.
Then normalize: cumulative frequency count / total count.
Finally, use the normalized value to get the output level.
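The steps above can be sketched in Python. Two things here are assumptions: the output level is taken as the normalized cumulative count multiplied by 255 and rounded to the nearest integer, and the histogram counts in the example are hypothetical, since the original table is not reproduced in these notes.

```python
def equalize_to_8bit(counts):
    """Map each input level to an 8-bit output level via the normalized CDF."""
    total = sum(counts)
    mapping = []
    cumulative = 0
    for count in counts:
        cumulative += count
        # round(255 * cumulative / total) in integer arithmetic, halves rounded up
        mapping.append((2 * 255 * cumulative + total) // (2 * total))
    return mapping


# Hypothetical 3-bit histogram: one pixel at each of the 8 levels.
print(equalize_to_8bit([1, 1, 1, 1, 1, 1, 1, 1]))
```

Integer arithmetic is used for the rounding so that values ending in .5 are not affected by floating-point error.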
input level to output level:
0 : (Rounded off)
1 :
2 : (Rounded off)
3 : (Rounded off)
4 : (Rounded off)
5 : (Rounded off)
6 : (Rounded off)
7 :
Therefore:
Tuto 3
Q1: Determine which of the following operators is for detecting horizontal (vertical) lines.
Rule:
How about other angles?
- Horizontal line + 45° line = 22.5°
- Horizontal line + -45° line = -22.5°
- Vertical line + 45° line = 67.5°
- Vertical line + -45° line = -67.5°
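For illustration, the sketch below applies the commonly used textbook 3x3 line-detection masks for horizontal and vertical lines. These masks are an assumption: the operators from the question are not preserved in these notes and may differ.

```python
# Common textbook line-detection masks (assumed, not taken from the question).
HORIZONTAL = [[-1, -1, -1],
              [ 2,  2,  2],
              [-1, -1, -1]]
VERTICAL = [[-1, 2, -1],
            [-1, 2, -1],
            [-1, 2, -1]]


def response(window, mask):
    """Sum of element-wise products of a 3x3 window and a 3x3 mask."""
    return sum(window[r][c] * mask[r][c] for r in range(3) for c in range(3))


# A one-pixel-wide horizontal line through the centre of the window.
line = [[0, 0, 0],
        [1, 1, 1],
        [0, 0, 0]]
print(response(line, HORIZONTAL), response(line, VERTICAL))
```

The mask whose large coefficients lie along the line direction gives the strongest response, which is how an operator can be identified by inspection.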
Q2:
- First count the frequency of each intensity.
- Use the average intensity as the initial threshold value = 6.375
- Then split the pixels into 2 regions using the threshold, and count the frequency of each intensity in each region.
- Compute a new threshold T = (mean intensity of region 1 + mean intensity of region 2)/2 = 6.25
- Repeat until the difference in T between successive iterations is smaller than a certain value.
The final threshold = 6.25
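The iteration can be sketched as follows. The pixel list in the example is hypothetical (the image from the question is not reproduced here), and sending pixels equal to the threshold to the lower region is an assumption.

```python
def iterative_threshold(pixels, eps=0.01):
    """Iterative threshold selection using the means of the two regions."""
    t = sum(pixels) / len(pixels)  # initial threshold: average intensity
    while True:
        low = [p for p in pixels if p <= t]
        high = [p for p in pixels if p > t]
        if not low or not high:
            return t  # all pixels fell on one side; nothing more to refine
        new_t = (sum(low) / len(low) + sum(high) / len(high)) / 2
        if abs(new_t - t) < eps:
            return new_t
        t = new_t


# Hypothetical data: a dark cluster and a bright cluster.
print(iterative_threshold([1, 2, 3, 9, 10, 11]))
```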
Q3:
If the maximum intensity difference in a region is > 1, split the region.
First, do the quad-tree decomposition.
Then draw the complete quad tree. (Note: you need to annotate the tree to show which region is which.)
Merge the regions to form the final segmentation result (select the largest region first, from left to right). Use the average intensity of a region in merging the regions.
Therefore: Start from:
Then finally:
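The splitting step can be sketched as a recursive function. It assumes a square image whose side is a power of two, and the 4x4 image in the example is hypothetical. The merging step is not covered by this sketch.

```python
def quadtree_split(img, row, col, size, max_diff=1):
    """Return the leaf regions as (row, col, size) tuples.

    A region is split into four quadrants while its maximum intensity
    difference exceeds max_diff.
    """
    block = [img[r][c] for r in range(row, row + size)
             for c in range(col, col + size)]
    if size == 1 or max(block) - min(block) <= max_diff:
        return [(row, col, size)]
    half = size // 2
    leaves = []
    for dr in (0, half):          # top quadrants first, then bottom
        for dc in (0, half):      # left before right
            leaves += quadtree_split(img, row + dr, col + dc, half, max_diff)
    return leaves


# Hypothetical 4x4 image.
image = [
    [1, 1, 5, 5],
    [1, 2, 5, 5],
    [7, 7, 3, 9],
    [7, 7, 3, 3],
]
print(quadtree_split(image, 0, 0, 4))
```

In this example only the bottom-right quadrant has to be split a second time.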
Q4:
- The seed point (starting point) is selected by the user (usually the highest or lowest intensity)
- Select a threshold
- Then start growing (follow checking order)
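A minimal region-growing sketch follows. It assumes 4-connectivity, a checking order of up, right, down, left, and a criterion that compares each candidate pixel with the seed intensity; the question may specify a different order or criterion. The image is hypothetical.

```python
from collections import deque


def region_grow(img, seed, threshold):
    """Grow a region from seed; accept neighbours within threshold of the seed value."""
    rows, cols = len(img), len(img[0])
    seed_value = img[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((-1, 0), (0, 1), (1, 0), (0, -1)):  # assumed checking order
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and abs(img[nr][nc] - seed_value) <= threshold):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


# Hypothetical image; the seed is the brightest pixel in the top-left corner.
picture = [
    [9, 8, 1],
    [7, 2, 1],
    [9, 1, 0],
]
print(sorted(region_grow(picture, (0, 0), 2)))
```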
Tuto 4
Q1: Use the splitting technique to obtain a polygon for representing a circle. The splitting criterion is that the maximum distance of the boundary from the corresponding side of the approximated polygon is larger than a threshold expressed in terms of the radius of the circle.
Split once
Calculate
distance from the farthest boundary pixel =
threshold = (given in question)
Since the distance from the farthest boundary pixel is NOT > threshold, no further splitting is needed.
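For reference, the largest gap between a circle and a side of an inscribed regular polygon can be computed directly. This sketch assumes the splitting starts from a diameter (a degenerate 2-sided polygon) and that one split produces an inscribed square; the original figure and threshold value are not preserved, so this is only a geometric check.

```python
import math


def max_gap(radius, sides):
    """Largest distance from the circle to a side of an inscribed regular polygon."""
    return radius * (1 - math.cos(math.pi / sides))


print(max_gap(1.0, 2))  # before splitting: the side is a diameter
print(max_gap(1.0, 4))  # after one split: an inscribed square
```

For a unit circle the gap drops from the full radius to about 0.29 of the radius after one split, so any threshold between those two values stops the splitting at the square.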
Q2:
(i)
By just looking at the first 3 components:
A : (80,40,0)
B : (-20, 60, -20)
C : (0, 113, 0)
D : (35, 129, 0)
Compare with the Fourier descriptors of a circle (0,40,0):
C is likely a scaled circle.
A and D are likely translated circles (because a DC component, the first component, is present).
(ii)
Use the IDFT to recover the boundary coordinates of Object B from its Fourier descriptors.
IDFT([-20,60,-20,20,-20,21,-20,20])
Coordinates are :
Use the IDFT in the same way to recover the boundary coordinates of Object D from its Fourier descriptors.
The final result should look something like this:
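The IDFT step can be sketched in plain Python. The placement of the 1/N factor in the inverse transform is an assumption (conventions differ between textbooks, and a different choice only rescales the shape). Each boundary point is a complex number x + jy.

```python
import cmath


def idft(descriptors):
    """Inverse DFT: boundary points s(k) = (1/N) * sum_u a(u) * exp(j*2*pi*u*k/N)."""
    n = len(descriptors)
    points = []
    for k in range(n):
        total = sum(a * cmath.exp(2j * cmath.pi * u * k / n)
                    for u, a in enumerate(descriptors))
        points.append(total / n)
    return points


# Fourier descriptors of Object B as given in the notes.
object_b = [-20, 60, -20, 20, -20, 21, -20, 20]
for p in idft(object_b):
    print(f"({p.real:.2f}, {p.imag:.2f})")
```

Plotting the resulting (x, y) pairs for each object gives the shapes that are compared in part (iii).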
(iii)
From the Graph, it is D.
From the diameter, it is also D.
Q3:
Correct Answer:
You may ask, why is it not a straight line instead?
Because a point is marked as part of the medial axis (the skeleton of the region) only if there is more than 1 shortest path (shortest distance) from it to the boundary.
So this one (the horizontal red line) is a wrong MAT.
Wrong answer (and why it is wrong):
Q4:
Chain code starting at the top:
{67066434311}
First compute the difference code, then normalize the starting point.
Normalization with respect to the orientation of the object:
Consider the chain code {67066434311}. Each difference is (current digit - previous digit) mod 8, treating the code as circular.
Therefore difference = {51160671760}
Normalization with respect to the starting point: rotate the code so that it forms the minimum integer.
Using the orientation-normalized code {51160671760},
the code normalized with respect to the starting point = {05116067176}
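Both normalizations can be checked with a short sketch. The function names are illustrative; the chain code is treated as circular, so the first difference compares the first digit with the last one.

```python
def first_difference(chain, directions=8):
    """Circular first difference of a chain code (orientation normalization)."""
    return [(chain[i] - chain[i - 1]) % directions for i in range(len(chain))]


def min_rotation(code):
    """Rotation of the code that forms the smallest integer (starting-point normalization)."""
    return min(code[i:] + code[:i] for i in range(len(code)))


chain = [6, 7, 0, 6, 6, 4, 3, 4, 3, 1, 1]
difference = first_difference(chain)
print("".join(map(str, difference)))
print("".join(map(str, min_rotation(difference))))
```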
Tuto 5
Q1: A series of messages is to be transferred between two computers over a PSTN. The messages comprise just the characters A through H. Analysis has shown that the probability (relative frequency of occurrence) of each character is as follows:
A and B = 0.25 each,
C and D = 0.14 each,
E, F, G, and H = 0.055 each
Q1(a): By computing the entropy of the source, derive the minimum average number of bits per character.
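The entropy, H, of a random variable that takes n values with corresponding probabilities p_i (i = 1, ..., n) is given by:
H = -Σ p_i log2(p_i)
Then the entropy of these 8 characters:
H = 2(0.25 log2(1/0.25)) + 2(0.14 log2(1/0.14)) + 4(0.055 log2(1/0.055)) ≈ 2.71 bits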
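The entropy can be computed directly from the probabilities given in the question. The dictionary and function names are illustrative.

```python
import math

# Probabilities from Q1.
PROBS = {"A": 0.25, "B": 0.25, "C": 0.14, "D": 0.14,
         "E": 0.055, "F": 0.055, "G": 0.055, "H": 0.055}


def entropy(probs):
    """Shannon entropy in bits: H = -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probs.values() if p > 0)


print(round(entropy(PROBS), 3))
```

This value is the lower bound on the average number of bits per character for any code for this source.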
Q1(b): Use Huffman coding to derive a codeword set.
First sort the characters by probability (from highest to lowest).
In each step, combine the lowest probability with the second-lowest probability, then sort again to form the next stage.
Then assign codewords, working from the shortest to the longest.
In the last stage, assign 0 to one branch and 1 to the other (for example 0 to the higher probability and 1 to the lower).
Break each combined element into its 2 parts and assign 0 and 1 in the same way. Either labelling gives a valid code as long as it is applied consistently.
If an element is not alone, break it further until it is alone.
If an element is alone, the bits assigned so far form its codeword.
Q1(c): Derive the average number of bits per character for your codeword set.
For each symbol: probability × codeword length.
Then add them all together.
Average = 2(0.25 × 2) + 2(0.14 × 3) + 4(0.055 × 4) = 2.72 bits/codeword
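The Huffman construction and the average codeword length can be sketched with a priority queue. This sketch only tracks codeword lengths, not the actual bit patterns, and breaks ties by insertion order, which is an assumption. The average length does not depend on how ties are broken.

```python
import heapq
from itertools import count


def huffman_lengths(probs):
    """Return {symbol: codeword length} for a Huffman code."""
    tick = count()  # tie-breaker so that symbol lists are never compared
    heap = [(p, next(tick), [sym]) for sym, p in probs.items()]
    heapq.heapify(heap)
    lengths = {sym: 0 for sym in probs}
    while len(heap) > 1:
        p1, _, group1 = heapq.heappop(heap)  # lowest probability
        p2, _, group2 = heapq.heappop(heap)  # second-lowest probability
        for sym in group1 + group2:
            lengths[sym] += 1  # every merge adds one bit to these symbols
        heapq.heappush(heap, (p1 + p2, next(tick), group1 + group2))
    return lengths


source = {"A": 0.25, "B": 0.25, "C": 0.14, "D": 0.14,
          "E": 0.055, "F": 0.055, "G": 0.055, "H": 0.055}
code_lengths = huffman_lengths(source)
average = sum(source[s] * code_lengths[s] for s in source)
print(code_lengths)
print(round(average, 2))
```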
Q2:
Q2(a): Find the output bitstream for each 4x4 block.
First apply zigzag scan to get the 1-dimensional array.
The first term is the DC DIFF; the other terms are AC.
(If the trailing ACs are all 0s, just denote them as EOB.)
DC DIFF means (this block's DC - previous block's DC).
Then turn them into the run-level representation.
For DC:
the representation is (SIZE, AMPLITUDE).
For AC:
the run-level representation is (number of preceding 0s, SIZE)(AC value).
(For EOB, just denote it as (0,0).)
Finally, map them according to the table.
For DC:
SIZE as index => match the DC coding table
AMPLITUDE as binary (if AMPLITUDE is negative, take the 1's complement of its magnitude)
For AC:
(number of 0s, SIZE) => match the AC coding table
AC value as binary (if the AC value is negative, take the 1's complement of its magnitude)
Then join the DC code with all the AC codes to get the output bitstream.
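The zigzag scan and the run-level step can be sketched as follows. The block values and the previous DC value are hypothetical, and the final lookup in the DC and AC coding tables is left out because the tables are not reproduced in these notes. All function names are illustrative.

```python
def zigzag(block):
    """Read a square block in zigzag order (along anti-diagonals, alternating direction)."""
    n = len(block)
    order = sorted(
        ((r, c) for r in range(n) for c in range(n)),
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]),
    )
    return [block[r][c] for r, c in order]


def amplitude_bits(value):
    """Return (SIZE, bit string); negative values use the 1's complement of the magnitude."""
    size = abs(value).bit_length()
    if size == 0:
        return 0, ""
    if value < 0:
        value = abs(value) ^ ((1 << size) - 1)
    return size, format(value, f"0{size}b")


def run_level(block, prev_dc):
    """Return the DC symbol followed by the AC ((run, SIZE), bits) symbols."""
    seq = zigzag(block)
    symbols = [("DC",) + amplitude_bits(seq[0] - prev_dc)]  # DC DIFF
    ac = seq[1:]
    had_trailing_zeros = bool(ac) and ac[-1] == 0
    while ac and ac[-1] == 0:
        ac.pop()
    run = 0
    for value in ac:
        if value == 0:
            run += 1
        else:
            size, bits = amplitude_bits(value)
            symbols.append(((run, size), bits))
            run = 0
    if had_trailing_zeros:
        symbols.append(((0, 0), ""))  # EOB
    return symbols


# Hypothetical quantized 4x4 block, with the previous block's DC assumed to be 10.
example_block = [
    [12, 3, 0, 0],
    [-2, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
print(run_level(example_block, prev_dc=10))
```

Each (SIZE) or (run, SIZE) pair would then be replaced by its table codeword and followed by the amplitude bits to form the bitstream.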
Q2(b): Determine the compression ratio for each 4x4 block.
First find the number of bits required for the original block.
In this case the block is 4x4 with 8 bits per pixel, so 16x8 = 128 bits.
Then find the number of bits in the bitstream (the answer from part (a)) for each block.
Finally, compression ratio = (number of bits in the original block) / (number of bits in the bitstream).