Verification & Validation

Verification

  • The process of evaluating a system/component to determine whether the products of a given development activity satisfy the conditions imposed at the start of that activity.
    • satisfying the conditions imposed?
  • Question at each step: “Are we building the product right?”

Validation

  • The process of evaluating a system/component during or at the end of the development process to determine whether the software does what the user really requires.
    • What does the user really require? Make sure the software reflects what the user wants.
  • General question: “Are we building the right product?”

The V & V process is a whole life-cycle process

  • V & V must be applied at each stage in the software process.
  • 2 Principal objectives
    • The discovery of defects in a system
    • The assessment of whether or not the system is usable in an operational situation.

Dynamic verification and Static verification

Dynamic V & V

Concerned with exercising and observing product behavior

  • Testing
  • Runtime verification, e.g., runtime monitoring
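
Runtime monitoring can be illustrated with a minimal sketch: a wrapper observes the calls made on a component while the program runs and reports a violation of a property. The `Connection` class, the `Monitor` wrapper, and the property being checked ("no `send` outside an open connection") are hypothetical names chosen for illustration only.

```python
class Connection:
    """Hypothetical component under observation."""
    def __init__(self):
        self.sent = []

    def open(self):
        pass

    def send(self, msg):
        self.sent.append(msg)

    def close(self):
        pass


class ProtocolViolation(Exception):
    """Raised by the monitor when the observed behavior breaks the property."""


class Monitor:
    """Wraps a Connection and checks at runtime that send() only
    happens between open() and close()."""
    def __init__(self, conn):
        self._conn = conn
        self._is_open = False

    def open(self):
        self._is_open = True
        self._conn.open()

    def send(self, msg):
        if not self._is_open:
            raise ProtocolViolation("send() called while connection is closed")
        self._conn.send(msg)

    def close(self):
        self._is_open = False
        self._conn.close()
```

The monitor only detects violations on executions that actually occur, which is why this counts as dynamic rather than static verification.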

Static verification

Concerned with analysis of the static system representation to discover problems (i.e., without executing the program)

  • Inspections (manual analysis by a human)
  • Static analysis tools

Static verification

  • Verifying the conformance of a software system and its specification without executing the code
  • Involves analyses of source text by humans or software
  • Can be carried out on ANY documents produced as part of the software process
  • Discovers errors early in the software process
  • Is complementary to other verification methods, like testing.

Effectiveness of static verification

  • More than 60% of program errors can be detected by informal program inspections
  • More than 90% of program errors may be detectable using more rigorous mathematical program verification
  • The error detection process is not confused by the existence of previous errors
  • BUT: Combining static verification with testing is most effective
  • You cannot do without testing

Manual Inspections

Traditional Inspections

  • Formalised approach to document reviews
  • Intended explicitly for defect DETECTION (not correction)
  • Defects may be logical errors, anomalies in the code that might indicate an erroneous condition (e.g. an uninitialised variable) or non-compliance with standards
  • Inspections can be used on any development artefacts (including requirements and design specs, code, and tests)

Example: Requirements Specification Inspections

Requirements specification inspection would check the document to make sure that the requirements are

  • Precise, unambiguous, clear, easy to read
  • Consistent
  • Relevant (relates to the problem)
  • Testable
  • Traceable
  • Complete
  • Free of implementation bias
  • Adequate with respect to security, usability, and performance

“Inspections” in Agile Development

Agile development includes practices focused on inspecting the quality of code and other deliverables

  • Code reviews
    • Every pull request should be reviewed by at least two other developers before merging into master
  • Task completion checklists
    • Checklist to verify that required quality control tasks have been successfully performed
  • Pair programming
    • One developer writes the code while the other reviews it

Static Analysis Tools

Automated analysis of the properties of a program without executing it

Applications:

  • Collecting information on program structures and dependencies
  • Collecting source code metrics
  • Checking adherence to programming standards and guidelines
  • Checking of correctness (verification)

Automated Static Verification

Control flow analysis

  • Locates poorly structured control flow, e.g., syntactically unreachable code due to goto
  • Modern languages aid in avoiding these problems
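
As a rough illustration of control flow analysis, the sketch below uses Python's standard `ast` module to flag statements that directly follow a `return`, `raise`, `break`, or `continue` in the same block. The function name `unreachable_lines` and the sample program are assumptions made for this example; real tools build a full control-flow graph and find many more cases.

```python
import ast

TERMINATORS = (ast.Return, ast.Raise, ast.Break, ast.Continue)

def unreachable_lines(source: str) -> list[int]:
    """Return line numbers of statements that directly follow a
    return/raise/break/continue within the same block."""
    found = []
    for node in ast.walk(ast.parse(source)):
        for field in ("body", "orelse", "finalbody"):
            block = getattr(node, field, None)
            if not isinstance(block, list):
                continue  # e.g. the body of a lambda is an expression
            for i, stmt in enumerate(block[:-1]):
                if isinstance(stmt, TERMINATORS):
                    # Everything after the terminator in this block is dead.
                    found.extend(s.lineno for s in block[i + 1:])
                    break
    return sorted(found)

SAMPLE = """\
def f(x):
    if x > 0:
        return 1
        print("never")
    return 0
"""

print(unreachable_lines(SAMPLE))  # [4]
```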

Data-flow analysis

  • Locates uninitialized variables and anomalies such as redundant variable updates
  • Detects serious errors and can be performed in reasonable time
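
A very small data-flow check can be sketched with the standard `ast` module: walk the statements of a function in order and flag names that are read before any assignment. This is a simplification that only handles straight-line code (no branches or loops); the function name `used_before_assignment` and the sample program are hypothetical.

```python
import ast
import builtins

def used_before_assignment(source: str) -> list[tuple[str, int]]:
    """Flag names read before any assignment in the first function of
    `source`. Only straight-line function bodies are handled."""
    func = ast.parse(source).body[0]
    # Parameters and builtins count as already defined.
    defined = {a.arg for a in func.args.args} | set(dir(builtins))
    problems = []
    for stmt in func.body:
        # Reads are checked before the statement's own stores take effect.
        for node in ast.walk(stmt):
            if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
                if node.id not in defined:
                    problems.append((node.id, node.lineno))
        for node in ast.walk(stmt):
            if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
                defined.add(node.id)
    return problems

SAMPLE = """\
def area(w, h):
    total = w * h + margin
    margin = 2
    return total
"""

print(used_before_assignment(SAMPLE))  # [('margin', 2)]
```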

Information flow analysis

  • Checking derive-dependencies between input and output data against a specification
  • Requires additional annotations in the code

Symbolic execution

  • Computing program properties using algebraic manipulation
  • Enables range checking, but is very time and memory consuming

Formal code verification

  • Checking code against a formal specification
  • Requires a formal specification

Limitations of Automated Static Verification

Control flow analysis, data-flow analysis and symbolic execution can be performed fully automatically and without any additional information beyond the code

  • Cost effective; tools are getting better (but some still expensive)
  • Can find only some categories of errors
  • Precision is usually not 100% (some conditions are identified only as potential errors)
  • Should be seen as an addition to testing

Formal code verification

  • Highest level of completeness and precision
  • Only partially automated
  • Requires formal specification
  • Requires highly skilled personnel
  • Very costly
  • Practicable only for smaller, safety-critical code

Modern Code Quality Tools

  • Integration of automated checks into continuous integration and deployment, e.g.,
    • static analysis of code to detect bugs, code smells (e.g., duplicated code), and security vulnerabilities
    • checking adherence to coding standards
    • measuring test coverage

Introduction to Testing

Testing Terminology

Program testing

  • Can reveal the presence of errors, NOT their absence
    • Only exhaustive testing can show a program is free from defects. However, exhaustive testing for any but trivial programs is impossible
  • A successful test is a test which discovers one or more errors
  • Should be used in conjunction with static verification
  • Run all tests after modifying a system
    • Known as regression testing
    • If the test suite is huge, change impact analysis may be performed so that only the relevant subset of tests is run
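
The idea of running only the tests affected by a change can be sketched as follows. The mapping from tests to modules is invented for illustration; in practice it would have to be derived from coverage data or dependency analysis.

```python
# Hypothetical mapping from test name to the modules it exercises.
TEST_DEPENDENCIES = {
    "test_login": {"auth", "db"},
    "test_checkout": {"cart", "payment", "db"},
    "test_search": {"catalog"},
}

def select_tests(changed_modules: set[str]) -> list[str]:
    """Return the tests that touch at least one changed module."""
    return sorted(
        name for name, deps in TEST_DEPENDENCIES.items()
        if deps & changed_modules
    )

print(select_tests({"db"}))       # ['test_checkout', 'test_login']
print(select_tests({"catalog"}))  # ['test_search']
```

The selection is only as good as the dependency mapping: a missing dependency means a relevant test is silently skipped, which is why rerunning the full suite remains the safe default.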

Testing in V-model:

[Figure: testing in the V-model]

Testing stages

  • Unit testing
    • Testing of individual components
  • Integration testing
    • Testing to expose problems arising from the combination of components
  • System testing
    • Testing the complete system prior to delivery
  • Acceptance testing
    • Testing by users to check that the system satisfies requirements. Sometimes called alpha testing

Types of testing

  • Defect testing
    • Aims to discover system defects
    • A successful defect test is one that reveals the presence of defects in a system.
  • Operational profile
    • Statistical tests designed to reflect the frequency of user inputs.
    • Used for reliability estimation and performance testing.
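
A minimal sketch of statistical test generation from an operational profile: operations are drawn at random with weights matching their assumed frequency of use. The operation names and the frequencies in `PROFILE` are made up for this example.

```python
import random

# Hypothetical operational profile: relative frequency of each user operation.
PROFILE = {"browse": 0.7, "search": 0.2, "purchase": 0.1}

def sample_operations(n: int, seed: int = 0) -> list[str]:
    """Draw n operations so that their frequencies follow the profile."""
    rng = random.Random(seed)  # seeded for reproducible test runs
    ops = list(PROFILE)
    weights = [PROFILE[op] for op in ops]
    return rng.choices(ops, weights=weights, k=n)

workload = sample_operations(1000)
for op in PROFILE:
    print(op, workload.count(op))
```

Because the generated workload mirrors expected usage, failure rates observed under it can serve as an estimate of operational reliability.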

Some Terminologies

  • Failure
    • The external behavior does not conform to the system specification
  • Error
    • a state of the system which, in the absence of any corrective action, could lead to a failure.
  • Fault
    • An adjudged cause of an error.
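
The three terms can be told apart on a tiny example. The `average` function below is a made-up illustration containing a deliberate fault; the comments mark where the fault, the error, and the failure occur.

```python
def average(values):
    total = 0
    for v in values[1:]:        # FAULT: the slice skips the first element
        total += v              # ERROR: total may now hold a wrong value
    return total / len(values)  # FAILURE: a wrong result becomes visible

# The fault is executed, but skipping a leading 0 leaves total correct,
# so no error state arises and no failure is observed.
print(average([0, 4, 8]) == 4.0)   # True

# Here the wrong total (10 instead of 12) propagates to the output:
# the result differs from the specified mean of 4.0, which is a failure.
print(average([2, 4, 6]) == 4.0)   # False
```

This also shows why a passing test does not demonstrate the absence of faults.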

Testing & Debugging Activities

  • Defect testing and debugging are completely different processes
    • Defect testing is concerned with confirming the presence of errors
    • Debugging is concerned with locating and repairing these errors
      • Debugging involves formulating hypotheses about program behaviour and then testing these hypotheses to find the system error

Debugging Activities

  1. Locate Error & Fault
  2. Design fault repair
  3. Repair fault
  4. Re-test program

Testing Activities

  1. Identify: Determine the test conditions (what is to be tested)
  2. Design: Determine how the “what” can be tested (realization)
  3. Implement: Implement the test cases (scripts, data)
  4. Execute: Run the system
  5. Compare: Compare the test case outcome with the expected outcome

Test conditions

  • What: Descriptions of circumstances that could be examined (event or item).
  • Categories: functionality, performance, stress, robustness, reliability, security/penetration, usability, …
  • Derived using testing techniques (to be discussed)

Testing Strategies

Execution of a test case against a program P

  • Covers certain requirements of P (functionality, quality);
  • Covers certain parts of P’s internal logic.

Coverage can be used to guide test case selection

  • Use testing resources effectively
  • Maximize bug finding for a given test budget

Black-box testing

  • The program is considered as a ‘black-box’

  • The test cases are based on the system specification

    • Deriving tests from requirements and interfaces (without access to design or implementation)
  • Test planning can begin early in the software process

  • Possible test design strategy

    • Equivalence partitioning (feeding valid and invalid inputs)
    • Coverage of input partitions or input property combinations
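
Equivalence partitioning can be sketched on a hypothetical specification: "a score between 0 and 100 is graded `pass` from 50 upward and `fail` otherwise; anything outside that range is rejected". The `grade` function stands in for the unit under test and is shown only so that the example runs; the test cases are derived from the specification alone.

```python
def grade(score: int) -> str:
    """Hypothetical unit under test: pass mark is 50, valid range 0..100."""
    if not 0 <= score <= 100:
        raise ValueError("score out of range")
    return "pass" if score >= 50 else "fail"

# One representative per valid partition, plus the partition boundaries.
VALID_CASES = {
    25: "fail",   # partition 0..49
    75: "pass",   # partition 50..100
    0: "fail", 49: "fail", 50: "pass", 100: "pass",  # boundaries
}
INVALID_CASES = [-1, 101]  # partitions below and above the valid range

def run_partition_tests() -> bool:
    for score, expected in VALID_CASES.items():
        assert grade(score) == expected
    for score in INVALID_CASES:
        try:
            grade(score)
        except ValueError:
            continue  # rejection is the expected behavior
        raise AssertionError(f"{score} should be rejected")
    return True
```

Testing one value per partition rests on the assumption that the program treats all members of a partition alike; boundary values are added because that assumption most often breaks at the edges.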

White-box testing (structural testing / structural coverage)

  • Derivation of test cases according to program structure.

    • Deriving tests to cover design & implementation
  • Knowledge of the program is used to identify additional test cases

  • Possible test design strategy

    • Equivalence partitioning
    • Code coverage

Code coverage

  • Statement coverage (all nodes)
    • % of statements covered (aiming for 100%)
  • Branch coverage (all edges)
    • % of branches in the control flow covered (aiming for a high %)
  • Path coverage
    • % of execution paths in a program covered
    • Hard to get high coverage for non-trivial programs
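
How statement coverage can be measured is sketched below with Python's standard `sys.settrace` hook, which reports each line as it is executed. The helper `executed_lines` and the sample function `sign` are assumptions made for this illustration; dedicated coverage tools are more robust than this sketch.

```python
import sys

def executed_lines(func, *args):
    """Run func(*args) and return the line offsets (relative to the
    `def` line) that were executed inside func."""
    code = func.__code__
    lines = set()

    def tracer(frame, event, arg):
        if frame.f_code is code and event == "line":
            lines.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # restore whatever tracer was active before
    return lines

def sign(x):
    if x > 0:            # offset 1
        return "pos"     # offset 2
    return "non-pos"     # offset 3

print(executed_lines(sign, 1))   # {1, 2}
print(executed_lines(sign, -1))  # {1, 3}
```

Neither test alone executes every line of `sign`; the union of both reaches all three offsets, i.e. 100% statement coverage.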

Example 1 of Code coverage

    def abs(x: float) -> float:
        return x if x >= 0 else -x

Statement coverage / branch coverage / path coverage:

  • 2 test cases (100%)

    assert abs(1) == 1
    assert abs(-1) == 1

Example 2 of Code coverage

A program finding the quadrant

    def find_quadrant(x: int, y: int) -> str:
        # Points on an axis (x == 0 or y == 0) are treated as positive.
        x_pos = x >= 0
        y_pos = y >= 0
        if x_pos:
            if y_pos:
                return "I"
            else:
                return "IV"   # x >= 0, y < 0
        else:
            if y_pos:
                return "II"   # x < 0, y >= 0
            else:
                return "III"

Statement coverage / branch coverage / path coverage:

  • 4 test cases are needed to cover all nodes (100%)

    assert find_quadrant(1, 1) == "I"
    assert find_quadrant(1, -1) == "IV"
    assert find_quadrant(-1, 1) == "II"
    assert find_quadrant(-1, -1) == "III"

Example 3 of Code coverage

    def ifseq(x, y):
        if x > 0:
            print("x positive")
        else:
            print("x not positive")
        if y > 0:
            print("y positive")
        else:
            print("y not positive")

Statement coverage / branch coverage (test inputs x, y):

    Variant 1:
    1, 1
    -1, -1

    Variant 2:
    1, -1
    -1, 1

Path coverage: 4 paths

    1, 1
    -1, 1
    1, -1
    -1, -1
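
The difference between branch and path coverage for `ifseq` can be checked mechanically. The sketch below is an assumption-laden stand-in: `ifseq_path` mirrors the two decisions of `ifseq` but returns the outcomes taken instead of printing, and `covered` counts the distinct branches and paths that a test set exercises.

```python
from itertools import product

def ifseq_path(x, y):
    """Same decisions as ifseq, but returns the branch outcomes taken."""
    first = "x>0" if x > 0 else "x<=0"
    second = "y>0" if y > 0 else "y<=0"
    return (first, second)

def covered(tests):
    """Return (number of branches covered, number of paths covered)."""
    paths = {ifseq_path(x, y) for x, y in tests}
    branches = {outcome for path in paths for outcome in path}
    return len(branches), len(paths)

print(covered([(1, 1), (-1, -1)]))          # (4, 2)
print(covered(product([1, -1], repeat=2)))  # (4, 4)
```

Two tests already take all four branches, but only two of the four paths; covering every path needs all four input combinations.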

Example 4 of Code coverage

    def ifseq2(x, y):
        if x > 0:
            print("x positive")
        if y > 0:
            print("y positive")
        else:
            print("y not positive")

Statement coverage (all nodes):

    1, 1
    1, -1

Branch coverage (all edges):

    Variant 1:
    1, 1
    -1, -1

    Variant 2:
    1, -1
    -1, 1

Path coverage (all execution paths):

    1, 1
    -1, 1
    1, -1
    -1, -1

Example 5 of Code coverage

    def find(arr, target):
        for i in range(len(arr)):
            if arr[i] == target:
                return i
        return -1

Statement coverage (all nodes):

    [3, 4], 3
    [], 1

Branch coverage (all edges):

    [4, 3], 3
    [], 1

Path coverage (all execution paths):

    [3, 4], 3
    [4, 3], 3
    [], 1

Testing in the Development Process

Unit Testing

  • Objective: Find differences between specified units and their implementations.
  • Unit: component (module, function, class, object, …)

Integration Test

  • Objectives:
    • To expose problems arising from the combination
    • To quickly obtain a working solution from components.
  • Problem areas
    • Internal: between components
      • Invocation: call/message passing/…
      • Parameters: type, number, order, value
      • Invocation return: identity (who?), type, sequence
    • External:
      • Interrupts (wrong handler?)
      • I/O timing
    • Interaction:
      • Structural
        • Bottom-up: modules with no dependencies, then higher level modules
        • Top-down: top modules with stubs (early demo), then replace stubs with modules
      • Behavioral
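
Top-down integration with stubs can be sketched as follows: the top-level module is tested first against a stub, and the stub is later replaced by the real lower-level module. All class names (`Checkout`, `PaymentStub`, `RealPayment`) are hypothetical and exist only for this illustration.

```python
class PaymentStub:
    """Stub standing in for a payment module that is not integrated yet."""
    def charge(self, amount):
        return True  # canned answer, no real processing


class RealPayment:
    """Hypothetical lower-level module that later replaces the stub."""
    def __init__(self, balance):
        self.balance = balance

    def charge(self, amount):
        if amount > self.balance:
            return False
        self.balance -= amount
        return True


class Checkout:
    """Top-level module; its dependency is passed in from outside."""
    def __init__(self, payment):
        self.payment = payment

    def buy(self, price):
        return "confirmed" if self.payment.charge(price) else "declined"


# Step 1: exercise the top module with the stub (early demo).
print(Checkout(PaymentStub()).buy(10))    # confirmed
# Step 2: replace the stub with the real module and re-run.
print(Checkout(RealPayment(5)).buy(10))   # declined
```

Passing the dependency in through the constructor is a design choice that makes the swap from stub to real module a one-line change in the test.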

System Test

  • Concerned with the app’s externals
  • Much more than functional
    • Load/stress testing
      • push it to its limit + beyond
    • Usability testing
      • Human element in system operation; GUI, messages, reports, …
    • Performance testing
    • Resource testing

Functional testing

  • Objective: Assess whether the app does what it is supposed to do
  • Basis: Behavioral/functional specification
  • Test case: A sequence of atomic system functions (ASFs), i.e., a thread

Acceptance Test

  • Purpose: ensure that end users are satisfied
  • Basis: user expectations (documented or not)
  • Environment: real
  • Performed: for and by end users (commissioned projects)
  • Test cases:
    • May reuse from system test
    • Designed by end users

Regression Test

  • Whenever a system is modified (fixing a bug, adding functionality, etc.), the entire test suite needs to be rerun
    • Make sure that features that already worked are not affected by the change
  • Automatic re-testing before checking in changes into a code repository
  • Incremental testing strategies for big systems

Exercise questions


Topic Summary:

Purpose of Testing: reveal the presence of errors NOT their absence

Unit Testing: Objective: Find differences between specified units and their implementations; localize bugs

Integration Test: Objective: To expose problems arising from the combination

System Test: Targets the system design

Acceptance Test: Targets user requirements; ensures that end users are satisfied

Regression Test: Rerun the entire test suite whenever the system is modified

Black-box Testing: Deriving tests from requirements, interfaces

White-box Testing: Deriving tests to cover design & implementation

Structural coverage is the coverage of design and code (white-box)

Code coverage:

  • Statement coverage: all nodes

    • All statements (i.e., lines of code) are executed at least once
  • Branch coverage: all edges

    • Every outcome of every decision (each branch in the control flow) is taken at least once
  • Path coverage: all execution paths

    • Cover all feasible paths

Which statement is correct? Mark both if both correct.

  • Testing can show the absence of bugs.

Not correct. Testing can show the presence of bugs, not their absence.

  • Testing can show the presence of bugs.

Correct.

  • Testing improves software quality.

Correct.

Give an example of a black-box coverage metric.

Coverage of requirements (or coverage of input partitions, or input property combination)

What is the difference between acceptance and system testing?

Acceptance testing targets the user/customer requirements;

System testing is informed also by the system design

What is the key benefit of unit testing as compared to system testing?

Unit testing finds differences between the specified units and their implementations, so bugs are localized

What is the difference between black-box and white box testing?

Black-box Testing: Deriving tests from requirements, interfaces

White-box Testing: Deriving tests to cover design & implementation

Black-box tests are designed wrt. requirements and interfaces; white-box tests take design and implementation into account.

What is structural coverage?

Structural coverage is the coverage of design and code (white-box)

Consider the following sample program, draw its control-flow graph and give a test vector (Boolean assignments to A,B,C; and a natural number to N) set for statement coverage, branch coverage, and path coverage (for simplicity, consider one iteration at maximum). An example of a test vector is A, not B, ?, ? (the question mark indicates “don’t care” for the respective input). The test vector sets should be minimal (i.e., if any of the vectors is removed from a set , the respective coverage is not achieved).

    Input: A, B, C, N
    If A then
        If B then
            S1
        Else
            S2
        End if;
        S3
    Else
        If C then
            S4
        End if;
        S5
        For i:=0; i<N; i++ do
            S6
        End for;
    End if

[Figure: control-flow graph for the program above]
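
A candidate answer for statement coverage of the first exercise program can be checked mechanically. The sketch below is one possible Python rendering of the pseudocode (statements are recorded in a trace rather than executed), and `CANDIDATE` is one possible minimal set with the don't-care inputs fixed arbitrarily; it is not presented as the official solution.

```python
def program(a, b, c, n):
    """Python rendering of the first exercise program; returns the
    sequence of statements S1..S6 that were executed."""
    trace = []
    if a:
        if b:
            trace.append("S1")
        else:
            trace.append("S2")
        trace.append("S3")
    else:
        if c:
            trace.append("S4")
        trace.append("S5")
        for _ in range(n):
            trace.append("S6")
    return trace

def statements_covered(vectors):
    return {s for v in vectors for s in program(*v)}

ALL_STATEMENTS = {"S1", "S2", "S3", "S4", "S5", "S6"}

# Candidate set for statement coverage; don't-care inputs fixed arbitrarily.
CANDIDATE = [
    (True, True, False, 0),    # A, B, ?, ?
    (True, False, False, 0),   # A, not B, ?, ?
    (False, False, True, 1),   # not A, ?, C, N=1
]

def is_minimal(vectors):
    """True if removing any single vector loses statement coverage."""
    return all(
        statements_covered(vectors[:i] + vectors[i + 1:]) != ALL_STATEMENTS
        for i in range(len(vectors))
    )

print(statements_covered(CANDIDATE) == ALL_STATEMENTS)  # True
print(is_minimal(CANDIDATE))                            # True
```

The same trace-based approach extends to branch and path coverage by recording decision outcomes instead of statements.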

Consider the following sample program, draw its control-flow graph and give a test vector (Boolean assignments to A,B; and a natural number to N) set for statement coverage, branch coverage, and path coverage (consider up to one unrolling of the loop for path coverage). An example of a test vector is A, ?, ? (the question mark indicates “don’t care” for the respective input). The test vector sets should be minimal (i.e., if any of the vectors is removed from a set , the respective coverage is not achieved).

    Input: A, B, N
    If A then
        S1
    Else
        S2
        For i:=0; i<N; i++ do
            S3
        End for;
        If B then
            S4
        End if;
    End if

[Figure: control-flow graph for the program above]

Consider the following sample program, draw its control-flow graph and give a test vector (Boolean assignments to A,B,C,D) set for statement coverage, branch coverage, and path coverage (consider up to one unrolling of the loop for path coverage; for simplicity, assume that S5 sets C to false). An example of a test vector is A, not B, ?, ? (the question mark indicates “don’t care” for the respective input). The test vector sets should be minimal (i.e., if any of the vectors is removed from a set , the respective coverage is not achieved).

    Input: A, B, C, D
    If A then
        If B then
            S1
        Else
            S2
        End if;
        S3
    Else
        S4
        While C do
            S5
        End while;
        If D then
            S6
        End if;
    End if

[Figure: control-flow graph for the program above]