Image Quality

Understanding what constitutes a quality image, and how to optimize the quality of images used in 3VR systems, is critical to developing viable solutions. Meeting the specific requirements of 3VR facial surveillance is much more challenging than in a traditional CCTV deployment. As a result, many partners and users are unaccustomed to considering these issues and can easily overlook them.

By understanding and optimizing image quality, you will be able to:

  • Better qualify which opportunities are strong fits for 3VR

  • Set the appropriate level of expectations with partners and users

  • Design a system that accommodates real-world conditions

  • Deliver a solution that is optimally effective

Key Elements in Optimizing Images for Facial Surveillance

This report will examine and explain the many elements that are critical to using 3VR for facial surveillance. While fundamental principles apply to images from photographic cameras as well as CCTV cameras, the application will differ. The following summarizes the key elements so that the reader may have a quick reference for future use.

  • Any image, whether video or photo, requires sufficient detail. Detail is determined by:

    • the level of resolution in the image

    • the size of the person’s face relative to the size of the image

  • The person’s face must look directly towards the camera. This affects both the class of imported photos that are acceptable and how video cameras must be positioned to capture faces.

  • Imported images have to meet the requirements listed above. With the exception of mugshots and passport photos, most photos do not meet these requirements.

  • Video cameras must be positioned specifically to capture faces. The cost may be low, but the skill required is not trivial. Care must be taken to position cameras precisely so that they capture faces consistently.

Imaging Background

Overview

Two aspects of imaging are most important in understanding and optimizing 3VR facial surveillance:

  1. Resolution

  2. Field of View

Without understanding the impact of these two aspects, it will not be possible to master the use of 3VR for either importing images or capturing video. For that reason, these aspects are explained and relevant introductory material is presented before facial surveillance and its application to 3VR are analyzed.

Resolution

Resolution, as applied to images used in the 3VR system, is defined as the level of visual detail in the image.

Resolution is commonly defined in two dimensions: horizontal and vertical. For instance, a frequently cited resolution is 640 x 480 pixels. This means that there are 640 unique pixels across the image horizontally and 480 pixels down the image vertically.

The number of horizontal pixels is important for performing 3VR facial surveillance because it determines the amount of detail available for performing facial analysis. Sufficient horizontal pixels are required to perform facial recognition.

One standard metric used in describing video surveillance images is the Common Intermediate Format (CIF). It is a shorthand for citing resolution levels that are commonly used in digital video. Each CIF level defines a specific horizontal and vertical resolution.

The following table provides examples of different CIF levels.

CIF Level      Resolution     Example
Quarter CIF    176 x 144
CIF            352 x 288      Typical Internet streaming
2CIF           704 x 240
4CIF           704 x 576      NTSC camera max resolution
16CIF          1408 x 1152    1.5 Megapixel camera

Industry participants commonly cite different CIF levels. These CIF levels have a significant impact on whether imported images or captured video can be used for facial analysis.

For example, an image of a person recorded at 4CIF resolution may be usable by the 3VR system for facial analysis. The same image recorded at CIF resolution may not contain enough detail for facial analysis unless the face is extremely large (at least a quarter of the width of the field of view). See “Camera Placement” on the following page for more information and examples on field of view for facial analysis.
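The CIF versus 4CIF comparison can be checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the 80-pixel threshold is assumed from the lower bound of the head-width range given in the camera placement section of this document.

```python
# Assumed threshold: lower bound of the 80 - 100 horizontal pixels across
# the head cited in the camera placement section of this document.
MIN_HEAD_PIXELS = 80


def head_pixels(horizontal_pixels, head_fraction):
    """Return the number of horizontal pixels covered by the head.

    horizontal_pixels: horizontal resolution of the image (e.g. 704 for 4CIF)
    head_fraction: head width as a fraction of the image width (e.g. 1/7)
    """
    return horizontal_pixels * head_fraction


def usable_for_facial_analysis(horizontal_pixels, head_fraction):
    """Return True if the head spans at least MIN_HEAD_PIXELS pixels."""
    return head_pixels(horizontal_pixels, head_fraction) >= MIN_HEAD_PIXELS


if __name__ == "__main__":
    # Same framing (head is 1/7 of the image width) at two resolutions.
    print("4CIF, head 1/7:", usable_for_facial_analysis(704, 1 / 7))
    print("CIF,  head 1/7:", usable_for_facial_analysis(352, 1 / 7))
    # At CIF the head must fill about a quarter of the image width.
    print("CIF,  head 1/4:", usable_for_facial_analysis(352, 0.25))
```

Under these assumptions, the same framing that passes at 4CIF fails at CIF, and a CIF image passes only when the head fills roughly a quarter of the frame.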

Field of View

Need to Multi Excerpt content from other pages.

Camera Placement for Facial Analysis - 4 Factors

Field of View Determines Size and Resolution of Face

Facial analysis requires a certain minimum resolution level to be effective. This resolution level is measured in pixels.

  • 3VR requires a minimum of 35 horizontal pixels between the eyes (or about 80 - 100 horizontal pixels across the head) to perform facial analysis

  • 3VR performs analysis of all analog NTSC video at 4CIF (704 x 576 pixels)

Given these facts, and given that the average width between the eyes is 3” and the width of a head is approximately 6 - 7”, an NTSC camera at 4CIF resolution can capture faces in a field of view (FOV) no more than about 4.5 feet wide.
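The field-of-view limit follows from the pixel requirements and the physical measurements given above. The following is a rough sketch of that arithmetic, not a 3VR tool; the function name and the 6.5 inch midpoint head width are assumptions made for illustration.

```python
def max_fov_feet(horizontal_pixels, required_pixels, feature_width_inches):
    """Estimate the widest field of view, in feet, that still gives a
    facial feature the required number of horizontal pixels.

    horizontal_pixels: horizontal resolution of the analyzed image
    required_pixels: pixels that must span the feature (eyes or head)
    feature_width_inches: real-world width of that feature
    """
    pixels_per_inch = required_pixels / feature_width_inches
    fov_inches = horizontal_pixels / pixels_per_inch
    return fov_inches / 12.0


if __name__ == "__main__":
    width = 704  # 4CIF horizontal resolution

    # 35 pixels across a 3 inch eye spacing.
    print("From eye spacing:      %.2f ft" % max_fov_feet(width, 35, 3.0))
    # 80 and 100 pixels across an assumed 6.5 inch head width.
    print("From head (80 px):     %.2f ft" % max_fov_feet(width, 80, 6.5))
    print("From head (100 px):    %.2f ft" % max_fov_feet(width, 100, 6.5))
```

Depending on which measurement is used, the estimates fall between roughly 3.8 and 5 feet, which brackets the figure of about 4.5 feet.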

Tip

Place a person standing in the foreground in optimal focus. When they hold their arms stretched out from side-to-side, you should not be able to see their hands (the image should be cut off at their wrists). If you can see their hands in the image when they are standing in focus in the foreground, the field of view is too wide for facial analysis.

If you are reviewing prerecorded images from a camera that is already in place, measure the width of the head. If it is smaller than 1/7 (about 15%) of the field of view, the field of view is too wide for facial analysis.
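For prerecorded images, the 1/7 rule can be applied to pixel measurements taken from a frame. This is a minimal sketch with a hypothetical function name; it assumes the head width and image width are both measured in pixels on the same frame.

```python
def fov_narrow_enough(head_width_px, image_width_px):
    """Return True if the head is at least 1/7 of the image width.

    Integer arithmetic is used so that a head of exactly 1/7 of the
    image width is not rejected because of floating-point rounding.
    """
    if head_width_px <= 0 or image_width_px <= 0:
        raise ValueError("widths must be positive pixel counts")
    return head_width_px * 7 >= image_width_px


if __name__ == "__main__":
    # Example measurements on a 704 pixel wide (4CIF) frame.
    print(fov_narrow_enough(101, 704))  # head just over 1/7 of the frame
    print(fov_narrow_enough(78, 704))   # head about 1/9 of the frame
```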

FOV Rating            FOV Width (Feet)    Face Width
Poor FOV              5.5                 1/9 of the image
Max Acceptable FOV    4.5                 1/7 of the image
Excellent FOV         3.5                 1/6 of the image

Horizontal Angle

Facial analysis requires a clear image of the full face, directly facing the camera, with minimal turning of the head to the left or right (horizontal angle) relative to the camera.

Tip

The image must simultaneously show both ears of the subject. If one of the ears is not visible, the horizontal angle is too extreme.


Vertical Angle

Facial analysis requires a clear image of the full face, directly facing the camera, with minimal tilting of the head up or down (vertical angle) relative to the camera.

Tip

The middle of the nose should be higher than or at least the same level as the bottom of the earlobes. If the middle of the nose appears below the earlobes, the vertical angle is too high.

Cameras must be mounted low enough, or far enough away, that the vertical slope does not exceed 20% above eye level when subjects are in focus in the foreground. Given an average eye height of 5 feet, a camera 10 feet away cannot be mounted more than 20% of 10 feet (2 feet) above eye height, so no higher than 5 + 2 = 7 feet. A camera 20 feet away can be mounted as high as 9 feet (20% of 20 feet = 4 feet; 5 + 4 = 9). See “Determining Camera Mounting Height” later in this section for more details.
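The mounting-height rule is a straight-line calculation, sketched below with hypothetical function names. The defaults (5 foot eye height, 20% maximum slope) are taken from the paragraph above; the inverse calculation is an added convenience and is not described in the source.

```python
def max_mount_height_feet(distance_feet, eye_height_feet=5.0, max_slope=0.20):
    """Return the highest mounting point that keeps the slope from the
    subject's eye level to the camera at or below max_slope."""
    return eye_height_feet + max_slope * distance_feet


def min_distance_feet(mount_height_feet, eye_height_feet=5.0, max_slope=0.20):
    """Return the shortest subject distance at which a camera mounted at
    mount_height_feet still satisfies the slope limit."""
    rise = max(mount_height_feet - eye_height_feet, 0.0)
    return rise / max_slope


if __name__ == "__main__":
    print(max_mount_height_feet(10))  # the 10 foot case from the text
    print(max_mount_height_feet(20))  # the 20 foot case from the text
    print(min_distance_feet(8))       # distance needed for an 8 foot mount
```

The first two calls reproduce the 7 foot and 9 foot limits worked out in the text.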


Lighting Level

Facial analysis requires even lighting that clearly shows the detail in a face. Lighting must not produce shadows or dark areas in the face (underexposure), nor glare or washed-out areas in the face (overexposure). A face with plenty of visible detail and a wide range of dark and light pixels (referred to as “dynamic range”) is required for facial analysis.

  • Photo A: Good lighting. There are no areas of the face in shadow or glare. There is wide dynamic range, with lots of both light and dark areas within the face, and lots of detail is visible.

  • Photo B: Overexposed. There are no areas of shadow, but there are significant areas with glare (notice the cheeks, nose and forehead). There is a narrower dynamic range, with excessive washed-out areas and loss of detail.

  • Photo C: Marginally acceptable. There are some areas of the face in mild shadow and the face appears somewhat darker than desired. There is marginally acceptable dynamic range, with moderate amounts of both light and dark areas within the face, and moderate amounts of detail are visible.

  • Photo D: Underexposed. There are substantial areas of shadow and/or not enough light. There is a narrower dynamic range, with excessive dark areas and loss of detail.
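The exposure categories above can be approximated by looking at how many face pixels are very dark or very bright. The sketch below is only an illustration of that idea: the function name and every threshold in it are hypothetical values chosen for the example, not 3VR parameters, and it assumes the face region has already been reduced to a list of 8-bit brightness values (0 to 255).

```python
# Hypothetical thresholds, chosen only for illustration.
DARK_LEVEL = 30       # brightness below this counts as shadow
BRIGHT_LEVEL = 225    # brightness above this counts as glare
MAX_FRACTION = 0.40   # tolerated share of shadow or glare pixels


def classify_exposure(face_pixels):
    """Classify a face region from a list of 0-255 brightness values."""
    if not face_pixels:
        raise ValueError("face_pixels must not be empty")

    total = len(face_pixels)
    dark_fraction = sum(1 for p in face_pixels if p < DARK_LEVEL) / total
    bright_fraction = sum(1 for p in face_pixels if p > BRIGHT_LEVEL) / total

    if dark_fraction > MAX_FRACTION:
        return "underexposed"
    if bright_fraction > MAX_FRACTION:
        return "overexposed"
    return "acceptable"


if __name__ == "__main__":
    mostly_dark = [10] * 60 + [128] * 40
    mostly_bright = [250] * 60 + [128] * 40
    wide_range = list(range(256))
    for name, pixels in [("dark", mostly_dark),
                         ("bright", mostly_bright),
                         ("wide range", wide_range)]:
        print(name, "->", classify_exposure(pixels))
```

A check like this only flags gross exposure problems; it does not measure evenness of lighting across the face, which still needs to be judged visually.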