Video management system

A video management system (VMS), also known as video management software plus a video management server, is a component of a security camera system that in general:

Collects video from cameras and other sources
Records / stores that video to a storage device
Provides an interface to both view live video and access recorded video

A VMS can be the software component of a network video recorder (NVR) or digital video recorder (DVR), though in general a VMS tends to be more sophisticated and to provide more options and capabilities than a packaged NVR device.^[1]

Due to improvements in technology, it is necessary to distinguish between a VMS and the built-in features of modern network-based security cameras. Many modern network cameras can record video themselves and let it be reviewed directly via a web browser, without the use of a VMS. However, a camera's built-in web interface is typically exclusive to the camera itself and does not normally provide shared access across other network cameras.^[1]

Additional capabilities

A VMS may also provide additional features and capabilities. These may be divided across several product tiers, with lower-cost VMS products having fewer features.

Motion detection

Rather than continuously recording data, a VMS may also implement motion detection to reduce the amount of data to be recorded.

In older analog security camera systems, the cameras were "dumb" devices only capable of producing a video signal continuously, and any video signal processing had to be done by the recording VMS. With modern megapixel network cameras, the cameras are much more sophisticated.

Motion detection can now be distributed, so that the cameras do motion detection themselves and only send video when motion is detected.

Alternatively, motion detection can still occur in the VMS. Some VMS products do not have motion detection and rely exclusively on the camera to perform it.
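As an illustration of motion detection performed in the VMS, the following is a minimal sketch of frame differencing, one common approach. It is not taken from any particular VMS product. The function names, the grayscale list-of-rows frame format, and both threshold values are assumptions chosen for the example.

```python
from typing import List, Sequence

# Hypothetical frame format: rows of grayscale pixel values in the range 0-255.
Frame = Sequence[Sequence[int]]


def changed_fraction(prev: Frame, curr: Frame, pixel_threshold: int = 25) -> float:
    """Return the fraction of pixels whose brightness changed by more than pixel_threshold."""
    total = 0
    changed = 0
    for prev_row, curr_row in zip(prev, curr):
        for p, c in zip(prev_row, curr_row):
            total += 1
            if abs(p - c) > pixel_threshold:
                changed += 1
    return changed / total if total else 0.0


def frames_to_record(frames: List[Frame], area_threshold: float = 0.01) -> List[int]:
    """Return the indices of frames that differ enough from the previous frame to be recorded."""
    keep = []
    for i in range(1, len(frames)):
        if changed_fraction(frames[i - 1], frames[i]) > area_threshold:
            keep.append(i)
    return keep
```

Frames whose index is not returned would simply not be written to storage, which is how this kind of check reduces the amount of recorded data.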

A difficulty with modern megapixel network cameras is the variety of standards and industry compliance with those standards. If a VMS does not have built-in motion detection, and a camera that is only accessible via ONVIF does not expose a motion control interface, and the camera manufacturer has not provided a custom API to the VMS authors, then motion detection will not be possible even if the camera is theoretically capable of it.
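The conditions in the paragraph above can be expressed as a small decision function. This is only a sketch of the reasoning. The `Camera` class and its field names are hypothetical and do not correspond to any ONVIF or vendor data structure.

```python
from dataclasses import dataclass


@dataclass
class Camera:
    has_motion_detection: bool   # the camera can detect motion internally
    onvif_exposes_motion: bool   # its motion interface is reachable through ONVIF
    vendor_api_available: bool   # the manufacturer gave the VMS authors a custom API


def motion_detection_possible(vms_has_motion_detection: bool, camera: Camera) -> bool:
    """Return True if motion-triggered recording can work for this VMS and camera pairing."""
    if vms_has_motion_detection:
        # The VMS can analyze the video itself, whatever the camera offers.
        return True
    # Otherwise the VMS depends on the camera, and needs some way to reach its detector.
    reachable = camera.onvif_exposes_motion or camera.vendor_api_available
    return camera.has_motion_detection and reachable
```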

Distributed processing

For a very large and complex security camera system, there may be too many cameras, too much network bandwidth, too much data to be analyzed, or too much storage required for a single server device to handle the workload.

In this case the workload is divided across multiple server devices, each handling a slice of the overall workload.

The VMS provides a single management interface allowing clients to access camera sources across all servers, making them appear to be a unified collection rather than isolated on multiple independent sources.

This functionality may be reserved for the higher-end or more expensive VMS product options, and may not be available from a low-cost VMS.
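The idea of dividing cameras across servers while presenting one unified collection can be sketched as follows. The round-robin assignment is an assumption made for simplicity, since a real system would more likely balance on bandwidth or storage. All names here are hypothetical.

```python
from typing import Dict, List


def assign_cameras(camera_ids: List[str], server_names: List[str]) -> Dict[str, str]:
    """Spread cameras across servers in round-robin order; returns camera -> server."""
    if not server_names:
        raise ValueError("at least one server is required")
    return {
        cam: server_names[i % len(server_names)]
        for i, cam in enumerate(camera_ids)
    }


class UnifiedDirectory:
    """Single lookup point that hides which server holds each camera."""

    def __init__(self, assignment: Dict[str, str]) -> None:
        self._assignment = dict(assignment)

    def list_cameras(self) -> List[str]:
        # Clients see one flat list, regardless of how many servers exist.
        return sorted(self._assignment)

    def server_for(self, camera_id: str) -> str:
        # The directory resolves the server so the client does not need to know it.
        return self._assignment[camera_id]
```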

Audio

A VMS can also be capable of recording audio from network cameras, and may in some cases provide two-way audio through a network camera, acting as an intercom.

Two-way audio typically requires an external amplifier and speaker at the camera. Some network cameras include a built-in microphone, or may provide external audio I/O connections.

Alarm I/O

A VMS may provide the ability to monitor alarm inputs and act on them in some manner, including:

Sending alarm outputs to activate ancillary equipment such as lighting
Beginning recording on one or more camera sources
Sending an alert message to one or more people, via email, cellphone SMS, or over the Internet to a client application or mobile phone app.

Alarm inputs and outputs can be handled through separate interface components such as:

Computer expansion cards
Multipoint long-distance serial interfaces such as RS-422 or RS-485
Directly integrated into network-based cameras
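The mapping from an alarm input to the actions listed above can be sketched as a rule table. This is an illustrative structure only. The rule fields, input names, and action strings are hypothetical, and the function returns descriptions of actions rather than driving real hardware or sending real messages.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AlarmRule:
    outputs: List[str] = field(default_factory=list)     # alarm outputs to activate, e.g. lighting
    cameras: List[str] = field(default_factory=list)     # camera sources to begin recording
    recipients: List[str] = field(default_factory=list)  # people to alert


def handle_alarm(input_name: str, rules: Dict[str, AlarmRule]) -> List[str]:
    """Return the actions configured for an alarm input, in a fixed order."""
    rule = rules.get(input_name)
    if rule is None:
        return []  # no rule configured for this input
    actions = [f"activate output {o}" for o in rule.outputs]
    actions += [f"start recording {c}" for c in rule.cameras]
    actions += [f"notify {r}" for r in rule.recipients]
    return actions
```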

Pan tilt zoom control

A VMS may also provide the ability to remotely control pan-tilt-zoom (PTZ) cameras, which can be remotely rotated, tilted, and zoomed, thereby allowing a single camera to monitor a very large area while also providing detailed views of specific areas of interest.

PTZ itself can be implemented as:

A real analog motion control, driving physical motors in the camera device
A digital translation of a fixed camera view, zooming in on the image and panning that close-up view around the full image
A combination of analog and digital PTZ, potentially with a combined control system that is at first analog, but switches to digital once the optical zoom-in limit has been reached
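A combined control that uses the physical zoom first and switches to digital zoom at the optical limit could be sketched as below. The function name and the way the zoom factor is split are assumptions for illustration, not a description of any specific camera or VMS.

```python
from typing import Tuple


def split_zoom(requested: float, optical_max: float) -> Tuple[float, float]:
    """Return (optical, digital) zoom factors whose product equals the requested zoom."""
    if requested < 1.0 or optical_max < 1.0:
        raise ValueError("zoom factors must be at least 1.0")
    # Use the physical lens as far as it goes.
    optical = min(requested, optical_max)
    # Whatever remains beyond the optical limit is made up digitally.
    digital = requested / optical
    return optical, digital
```

For a hypothetical lens with a 10x optical limit, a 5x request stays entirely optical, while a 20x request uses the full 10x optical zoom plus 2x digital zoom.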

Digital PTZ has become very common as network cameras have increased in resolution beyond 1080p. It is no longer possible to directly view all the pixels of some high-resolution cameras even with a 4K computer monitor, and digital zooming is required to see the fine detail being captured by the camera.
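Digital PTZ amounts to choosing a rectangle of the full-resolution frame and scaling it to the display. The following sketch computes that rectangle. The function name and parameters are hypothetical, and the clamping behaviour at the frame edges is one possible design choice.

```python
from typing import Tuple


def crop_window(
    frame_w: int, frame_h: int, zoom: float, center_x: int, center_y: int
) -> Tuple[int, int, int, int]:
    """Return (left, top, width, height) of the source region shown at a digital zoom factor."""
    if zoom < 1.0:
        raise ValueError("zoom must be at least 1.0")
    # A higher zoom shows a proportionally smaller part of the frame.
    w = round(frame_w / zoom)
    h = round(frame_h / zoom)
    # Center the window on the requested point, then clamp it inside the frame
    # so that panning stops at the image borders.
    left = min(max(center_x - w // 2, 0), frame_w - w)
    top = min(max(center_y - h // 2, 0), frame_h - h)
    return left, top, w, h
```

Panning is then a matter of moving `center_x` and `center_y`, and zooming a matter of changing `zoom`, with no physical movement of the camera.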

Digital PTZ has the potential to reduce equipment maintenance and failure, by replacing a physically moving camera with a fixed very high resolution camera. Moving cameras have a tendency to fail over time due to wear of the drive motors, belts, and bearings. They may also be sensitive to temperature changes, and fail to function at temperature extremes. A fixed position camera removes these components, relying solely on digital translation of the high detail camera image.

Fixed-view fisheye cameras have a bowl-shaped 360-degree view. When mounted overhead pointing straight down, part of the viewed space appears sideways or upside-down on the VMS. For these cameras, the digital PTZ may also include a rotation feature to digitally rotate the view so that all zoomed-in viewed areas appear right-side-up.

License plate detection / License plate recognition

A VMS can optionally provide the ability to locate license plates in a camera's view and capture the plate information from the image, as a form of optical character recognition.

For fixed-location cameras, each plate number is stored in a database along with the time it was captured, and is combined with data from many other cameras to create a geographical time plot of where plates are seen. License plate readers may be used to anonymously track the location of vehicles over the course of many days, to build a profile of vehicle usage and activity.
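The stored plate records and the resulting per-plate timeline can be sketched with a simple in-memory structure. The record fields and function name are hypothetical, and a real deployment would use a database rather than Python lists.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Sighting(NamedTuple):
    plate: str            # plate text as read from the image
    camera_location: str  # where the fixed camera is installed
    timestamp: float      # capture time, in seconds since the epoch


def plate_timelines(sightings: List[Sighting]) -> Dict[str, List[Sighting]]:
    """Group sightings by plate and order each group by capture time."""
    timelines: Dict[str, List[Sighting]] = defaultdict(list)
    for s in sightings:
        timelines[s.plate].append(s)
    for plate in timelines:
        timelines[plate].sort(key=lambda s: s.timestamp)
    return dict(timelines)
```

Reading one plate's list in order gives the sequence of camera locations at which that vehicle was seen, which is the basis of the geographical time plot.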

For mobile cameras, license plate detection works similarly, but uses GPS to log where and when a plate was seen. The VMS also provides on-the-fly data to monitor surrounding vehicles on the road, and to look up vehicle details such as registration or potential criminal activity.

Hybrid analog / digital recording

An organization can have a significant investment in older analog (NTSC/PAL/SECAM) cameras and associated cabling and power infrastructure. The organization may decide to keep using the older cameras in some locations, rather than replace everything with new higher detail network cameras.

For example, a low-resolution analog camera in a little-used storeroom may suffice for the task, and not need the expense of a high detail digital camera and the infrastructure costs to install and use the new camera.

A hybrid system provides for a lower cost transition between analog and digital cameras, allowing the VMS to accept input from either video source type. A hybrid system may use internal multi-input capture cards or external video encoder devices.

Point of Sale integration

A VMS may offer the ability to be linked to the output of an electronic cash register, displaying the information printed on a sales receipt as text overlaying the camera image. This provides a visual record of the sale, and tracks mistakes or potential theft by employees.
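One way to picture the overlay is as a lookup from a video frame's timestamp to the receipt lines printed shortly before it. The sketch below assumes the cash register output arrives as timestamped text lines. The names and the ten-second display window are hypothetical.

```python
from typing import List, NamedTuple


class ReceiptLine(NamedTuple):
    timestamp: float  # when the register printed the line, in seconds
    text: str         # the text as printed on the receipt


def overlay_text(
    lines: List[ReceiptLine], frame_time: float, hold_seconds: float = 10.0
) -> List[str]:
    """Return the receipt lines to draw over the frame captured at frame_time."""
    return [
        line.text
        for line in lines
        # Show a line from the moment it is printed until hold_seconds later.
        if 0 <= frame_time - line.timestamp <= hold_seconds
    ]
```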

Fisheye dewarping

As of 2016, fisheye dewarping is still a very new component of video management systems and is not yet widely deployed or consistently implemented.

A fisheye camera has a special lens that typically has a 180-degree field of view and can see 360 degrees around the lens. When mounted flat on a ceiling, a single fixed camera can see the entire space below it without moving. However, the spherical view causes angular distortion of straight lines, giving objects a strange bulged and deformed appearance.

Fisheye dewarping is a technique used by a VMS to take the output of a fisheye lens and mathematically correct the deformed image so that lines appear straight again, and objects look normal. The image is also typically rotated so that all portions of the view appear right-side-up.
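The mathematical correction can be illustrated with the equidistant fisheye model, in which the distance from the image center is proportional to the angle from the lens axis. This model is an assumption chosen for simplicity. Real lenses and the proprietary methods mentioned in this section may use different mappings, and the function name and parameters are hypothetical.

```python
import math
from typing import Tuple


def fisheye_pixel(
    theta: float, phi: float, cx: float, cy: float, radius: float, fov: float = math.pi
) -> Tuple[float, float]:
    """Map a viewing direction to fisheye image coordinates (equidistant model).

    theta: angle between the viewing ray and the lens axis, in radians
    phi: direction of the ray around the lens axis, in radians
    cx, cy: center of the fisheye circle in the image, in pixels
    radius: radius of the fisheye circle, in pixels
    fov: full field of view of the lens, in radians (180 degrees by default)
    """
    half_fov = fov / 2
    if not 0 <= theta <= half_fov:
        raise ValueError("viewing direction is outside the lens field of view")
    # Equidistant projection: radial distance grows linearly with theta.
    r = radius * theta / half_fov
    return cx + r * math.cos(phi), cy + r * math.sin(phi)
```

A dewarper would work backwards from each pixel of the corrected output view: compute that pixel's viewing direction, call a mapping like this one to find the matching position in the fisheye image, and copy the color found there.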

There are several competing standards for dewarping. Some manufacturers such as Oncam and Panasonic have developed their own custom techniques, and need to provide decoding libraries to the VMS programmers to support their cameras.

Alternatively, a high-performance dewarping method called Immervision has been developed, which also makes use of a special lens geometry to redistribute pixels in a more efficient manner. It has been licensed to camera manufacturers with fixed lens assemblies, and can also be implemented on box-style cameras that can accept a special lens assembly compatible with Immervision.

Dewarping may not be present in all VMS, and even among VMS that do advertise the capability, some VMS may not be compatible with all cameras due to a lack of decoding libraries for specific camera models and manufacturers.

A situation can also arise where dewarping works with a VMS but, due to a lack of in-camera motion detection support, the only recording options available are 24-hour recording or scheduled time-period recording.

Finally, dewarping is a computationally intensive task. Although multiple views of a single fisheye camera are possible, the combined processing to dewarp multiple high resolution views can overload the viewer computer.

Single recorded stream, multiple views

A feature of some newer VMS is the capability to show multiple camera views from a single recorded stream. This utilizes digital PTZ of high megapixel cameras, and may also be referred to as client-side dewarping for fisheye cameras.

A single camera with a very wide or high resolution field of view is capable of covering two or more areas of interest. This single datastream is recorded only once, but then decoded multiple times by the client viewer software, zooming in on the separate areas of interest while still only utilizing one camera datastream.

This can significantly reduce data storage requirements, where two or more separate cameras would have been used previously. In the case of fisheye cameras, it is possible for one camera to replace 10 or more separate camera views, while only recording the one original panoramic fisheye view.
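The storage saving can be estimated with simple arithmetic on stream bitrates. The sketch below assumes constant-bitrate streams, and the bitrates in the usage lines are hypothetical figures chosen only to show the calculation, not measurements.

```python
def storage_gb(bitrate_mbps: float, hours: float, streams: int = 1) -> float:
    """Return storage in gigabytes (10**9 bytes) for constant-bitrate streams."""
    bits = bitrate_mbps * 1_000_000 * hours * 3600 * streams
    return bits / 8 / 1_000_000_000


# Hypothetical comparison over 24 hours:
# two separate cameras at 4 Mbit/s each versus one wide-view camera at 6 Mbit/s.
two_cameras = storage_gb(4.0, 24.0, streams=2)
one_camera = storage_gb(6.0, 24.0, streams=1)
saving = two_cameras - one_camera
```

Whether a saving actually results depends on the bitrate the single high-resolution stream needs compared with the combined bitrate of the cameras it replaces.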

References/Citations

[1] "Video Management – Avilion-int". Retrieved 2022-04-16.
