Video

Video is an electronic medium for the recording, copying, playback, broadcasting, and display of moving visual media. Video was first developed for mechanical television systems, which were quickly replaced by cathode-ray tube (CRT) systems, which in turn were replaced by flat-panel displays of several types.


Video systems vary in display resolution, aspect ratio, refresh rate, color capabilities and other qualities. Analog and digital variants exist and can be carried on a variety of media, including radio broadcasts, magnetic tape, optical discs, computer files, and network streaming.

History

Analog video

NTSC composite video signal (analog) source – wikipedia

Video was invented decades after film, which records a sequence of miniature images visible to the eye when the film is physically examined. Video, by contrast, encodes images with electromagnetic waves.

Video technology was first developed for mechanical television systems, which were quickly replaced by cathode-ray tube (CRT) television systems. Video was originally exclusively a live technology. Charles Ginsburg led an Ampex research team developing one of the first practical video tape recorders (VTR). In 1951, the first VTR captured live images from television cameras by writing the camera’s electrical signal onto magnetic videotape.

Video recorders were sold for US$50,000 in 1956, and videotapes cost US$300 per one-hour reel.[4] However, prices gradually dropped over the years; in 1971, Sony began selling videocassette recorder (VCR) decks and tapes into the consumer market.

Digital video

Digital video is capable of higher quality and, eventually, a much lower cost than earlier analog technology. After the commercial introduction of the DVD in 1997, and later the Blu-ray Disc in 2006, sales of videotape and recording equipment plummeted. Advances in computer technology allow even inexpensive personal computers and smartphones to capture, store, edit and transmit digital video, further reducing the cost of video production and allowing program-makers and broadcasters to move to tapeless production. The advent of digital broadcasting and the subsequent digital television transition are relegating analog video to the status of a legacy technology in most parts of the world. The development of high-resolution video cameras with improved dynamic range and color gamuts, along with the introduction of high-dynamic-range digital intermediate data formats with improved color depth, has caused digital video technology to converge with film technology. Since 2013, the use of digital cameras in Hollywood has surpassed the use of film cameras.

Characteristics of video streams

Number of frames per second

Frame rate, the number of still pictures per unit of time of video, ranges from six or eight frames per second (frame/s) for old mechanical cameras to 120 or more frames per second for new professional cameras. PAL standards (Europe, Asia, Australia, etc.) and SECAM (France, Russia, parts of Africa etc.) specify 25 frame/s, while NTSC standards (United States, Canada, Japan, etc.) specify 29.97 frame/s. Film is shot at the slower frame rate of 24 frames per second, which slightly complicates the process of transferring a cinematic motion picture to video. The minimum frame rate to achieve a comfortable illusion of a moving image is about sixteen frames per second.
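The relationship between these frame rates can be made concrete with a little arithmetic. The following is a minimal sketch, not a definitive reference: the exact NTSC value of 30000/1001 (which rounds to the 29.97 frame/s quoted above) and the helper names `frame_duration_ms` and `video_frames_for_film` are assumptions introduced for illustration.

```python
from fractions import Fraction

# Nominal frame rates named in the text. The exact NTSC fraction
# 30000/1001 is an assumption added here; the text rounds it to 29.97.
FRAME_RATES = {
    "film": Fraction(24),
    "PAL/SECAM": Fraction(25),
    "NTSC": Fraction(30000, 1001),
}


def frame_duration_ms(rate):
    """Return how long one frame lasts, in milliseconds."""
    return float(1000 / rate)


def video_frames_for_film(film_frames, video_rate, film_rate=Fraction(24)):
    """Return how many video frame periods the given film frames span
    when the film is played back at its original speed."""
    return float(film_frames * video_rate / film_rate)


for name, rate in FRAME_RATES.items():
    print(f"{name}: {float(rate):.3f} frame/s, "
          f"{frame_duration_ms(rate):.2f} ms per frame")
```

One second of film (24 frames) spans 25 PAL frame periods and just under 30 NTSC frame periods, which is why a film-to-video transfer cannot simply map one film frame to one video frame.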

Interlaced vs progressive

Video can be interlaced or progressive. In progressive scan systems, each refresh period updates all scan lines in each frame in sequence. When displaying a natively progressive broadcast or recorded signal, the result is optimum spatial resolution of both the stationary and moving parts of the image. Interlacing was invented as a way to reduce flicker in early mechanical and CRT video displays without increasing the number of complete frames per second. Interlacing retains detail while requiring lower bandwidth compared to progressive scanning.

In interlaced video, the horizontal scan lines of each complete frame are treated as if numbered consecutively, and captured as two fields: an odd field (upper field) consisting of the odd-numbered lines and an even field (lower field) consisting of the even-numbered lines. Analog display devices reproduce each frame as two successive fields, effectively doubling the frame rate as far as perceptible overall flicker is concerned. When the image capture device acquires the fields one at a time, rather than dividing up a complete frame after it is captured, the frame rate for motion is effectively doubled as well, resulting in smoother, more lifelike reproduction of rapidly moving parts of the image when viewed on an interlaced CRT display.
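The division of a frame into odd and even fields can be sketched in a few lines. This is an illustrative model only: a frame is assumed to be a plain list of scan lines, and the function names `split_fields` and `weave` are hypothetical.

```python
def split_fields(frame):
    """Split a frame (a list of scan lines) into (odd_field, even_field).

    Scan lines are numbered from 1, so list index 0 holds line 1 and
    belongs to the odd (upper) field.
    """
    odd_field = frame[0::2]   # lines 1, 3, 5, ...
    even_field = frame[1::2]  # lines 2, 4, 6, ...
    return odd_field, even_field


def weave(odd_field, even_field):
    """Interleave two fields back into one full frame."""
    frame = []
    for index, line in enumerate(odd_field):
        frame.append(line)
        if index < len(even_field):
            frame.append(even_field[index])
    return frame


frame = [f"line {n}" for n in range(1, 7)]
odd, even = split_fields(frame)
print(odd)   # the odd (upper) field
print(even)  # the even (lower) field
```

Weaving the two fields back together reproduces the original frame exactly only when both fields were taken from the same instant; when they are captured one at a time, moving objects differ between the fields.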

NTSC, PAL and SECAM are interlaced formats. Abbreviated video resolution specifications often include an i to indicate interlacing. For example, PAL video format is often described as 576i50, where 576 indicates the number of visible horizontal scan lines, i indicates interlacing, and 50 indicates 50 fields (half-frames) per second.
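The abbreviated notation can be decoded mechanically. The sketch below is a hypothetical helper, `parse_scan_spec`, written for illustration. It follows the convention described above, in which the trailing number of an interlaced format counts fields per second. Accepting a `p` for progressive formats is an assumption, since the text only describes the `i` marker.

```python
import re


def parse_scan_spec(spec):
    """Parse a spec such as '576i50' into its parts.

    Assumes the trailing number is fields per second for interlaced
    formats and frames per second for progressive ones.
    """
    match = re.fullmatch(r"(\d+)([ip])(\d+(?:\.\d+)?)", spec)
    if match is None:
        raise ValueError(f"unrecognized scan specification: {spec!r}")
    lines, scan, rate = match.groups()
    interlaced = scan == "i"
    rate = float(rate)
    return {
        "lines": int(lines),
        "interlaced": interlaced,
        "rate": rate,
        # Two fields make up one complete interlaced frame.
        "frames_per_second": rate / 2 if interlaced else rate,
    }


print(parse_scan_spec("576i50"))
```

For 576i50 this yields 25 complete frames per second, matching the PAL frame rate.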

When displaying a natively interlaced signal on a progressive scan device, overall spatial resolution is degraded by simple line doubling, and artifacts such as flickering or “comb” effects appear in moving parts of the image unless special signal processing eliminates them. A procedure known as deinterlacing can optimize the display of an interlaced video signal from an analog, DVD or satellite source on a progressive scan device such as an LCD television, digital video projector or plasma panel. Deinterlacing cannot, however, produce video quality that is equivalent to true progressive scan source material.
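Simple line doubling, the crudest way to show a field on a progressive display, can be sketched as follows. This is a toy model with a field represented as a list of scan lines, and the name `line_double` is hypothetical. Practical deinterlacers use more elaborate processing that is not shown here.

```python
def line_double(field):
    """Build a full-height frame from one field by repeating each line."""
    frame = []
    for line in field:
        frame.append(line)  # the line as transmitted
        frame.append(line)  # a copy standing in for the missing line
    return frame


field = ["line 1", "line 3", "line 5"]
frame = line_double(field)
print(frame)
```

The output frame has twice as many lines as the field but no additional vertical detail, which illustrates why spatial resolution is degraded.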

Aspect ratio

Comparison of common cinematography and traditional television (green) aspect ratios. Source – wikipedia

Aspect ratio describes the proportional relationship between the width and height of video screens and video picture elements. All popular video formats are rectangular, and so can be described by a ratio between width and height. The ratio of width to height for a traditional television screen is 4:3, or about 1.33:1. High-definition televisions use an aspect ratio of 16:9, or about 1.78:1. The aspect ratio of a full 35 mm film frame with soundtrack (also known as the Academy ratio) is 1.375:1.

Pixels on computer monitors are usually square, but pixels used in digital video often have non-square aspect ratios, such as those used in the PAL and NTSC variants of the CCIR 601 digital video standard, and the corresponding anamorphic widescreen formats. The 720 by 480 pixel raster uses thin pixels on a 4:3 aspect ratio display and fat pixels on a 16:9 display.
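Why the same 720 by 480 raster needs thin pixels on one display and fat pixels on another follows from dividing the display aspect ratio by the raster's own width-to-height ratio. The sketch below is a simplified model that assumes all 720 columns fill the full display width. Pixel aspect ratios published for these formats can differ slightly from this arithmetic. The function name `pixel_aspect_ratio` is introduced here for illustration.

```python
from fractions import Fraction


def pixel_aspect_ratio(width_px, height_px, display_aspect):
    """Return the width-to-height ratio each pixel must have so that a
    width_px by height_px raster fills a display of the given aspect."""
    storage_aspect = Fraction(width_px, height_px)
    return display_aspect / storage_aspect


par_4_3 = pixel_aspect_ratio(720, 480, Fraction(4, 3))
par_16_9 = pixel_aspect_ratio(720, 480, Fraction(16, 9))
print(par_4_3, float(par_4_3))    # below 1: pixels narrower than tall
print(par_16_9, float(par_16_9))  # above 1: pixels wider than tall
```

A value below 1 corresponds to the thin pixels on a 4:3 display, and a value above 1 to the fat pixels on a 16:9 display.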

The popularity of viewing video on mobile phones has led to the growth of vertical video. Mary Meeker, a partner at Silicon Valley venture capital firm Kleiner Perkins Caufield & Byers, highlighted the growth of vertical video viewing in her 2015 Internet Trends Report – growing from 5% of video viewing in 2010 to 29% in 2015. Vertical video ads like Snapchat’s are watched in their entirety nine times more frequently than landscape video ads.

Color model and depth

Example of U-V color plane, Y value=0.5 Source – wikipedia

The color model defines the video color representation and maps encoded color values to visible colors reproduced by the system. There are several such representations in common use: typically YIQ is used in NTSC television, YUV is used in PAL television, YDbDr is used by SECAM television and YCbCr is used for digital video.

The number of distinct colors a pixel can represent depends on color depth expressed in the number of bits per pixel. A common way to reduce the amount of data required in digital video is by chroma subsampling (e.g., 4:4:4, 4:2:2, etc.). Because the human eye is less sensitive to details in color than brightness, the luminance data for all pixels is maintained, while the chrominance data is averaged for a number of pixels in a block and that same value is used for all of them. For example, this results in a 50% reduction in chrominance data using 2-pixel blocks (4:2:2) or 75% using 4-pixel blocks (4:2:0). This process does not reduce the number of possible color values that can be displayed, but it reduces the number of distinct points at which the color changes.
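The block averaging described above can be sketched directly. This is a minimal illustration operating on a single chroma plane stored as a list of rows. The function names are hypothetical, and the block shapes assumed here are 2 pixels wide by 1 high for 4:2:2 and 2 by 2 for 4:2:0.

```python
def subsample_chroma(chroma, block_w, block_h):
    """Average the chroma plane over blocks of block_w by block_h pixels
    and return one value per block."""
    rows, cols = len(chroma), len(chroma[0])
    result = []
    for top in range(0, rows, block_h):
        out_row = []
        for left in range(0, cols, block_w):
            block = [
                chroma[r][c]
                for r in range(top, min(top + block_h, rows))
                for c in range(left, min(left + block_w, cols))
            ]
            out_row.append(sum(block) / len(block))
        result.append(out_row)
    return result


def chroma_reduction(chroma, subsampled):
    """Return the fraction of chroma samples removed by subsampling."""
    original = sum(len(row) for row in chroma)
    kept = sum(len(row) for row in subsampled)
    return 1 - kept / original


plane = [
    [0, 2, 4, 6],
    [0, 2, 4, 6],
    [8, 10, 12, 14],
    [8, 10, 12, 14],
]
as_422 = subsample_chroma(plane, block_w=2, block_h=1)
as_420 = subsample_chroma(plane, block_w=2, block_h=2)
print(chroma_reduction(plane, as_422))
print(chroma_reduction(plane, as_420))
```

The luminance plane is left untouched in this scheme, so only the chrominance sample count shrinks.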

Video quality

Video quality can be measured with formal metrics like peak signal-to-noise ratio (PSNR) or through subjective video quality assessment using expert observation. Many subjective video quality methods are described in the ITU-R recommendation BT.500. One of the standardized methods is the Double Stimulus Impairment Scale (DSIS). In DSIS, each expert views an unimpaired reference video followed by an impaired version of the same video. The expert then rates the impaired video using a scale ranging from “impairments are imperceptible” to “impairments are very annoying”.
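As an illustration of a formal metric, PSNR is commonly computed from the mean squared error between a reference and an impaired image. The sketch below assumes 8-bit samples (peak value 255) supplied as flat lists of equal length. The function name `psnr` and its signature are choices made for this example.

```python
import math


def psnr(reference, impaired, max_value=255):
    """Return the peak signal-to-noise ratio in decibels between two
    equally sized sequences of sample values."""
    if not reference or len(reference) != len(impaired):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - i) ** 2 for r, i in zip(reference, impaired)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs: no measurable impairment
    return 10 * math.log10(max_value ** 2 / mse)


reference = [100, 100, 100, 100]
impaired = [101, 99, 101, 99]
print(f"{psnr(reference, impaired):.2f} dB")
```

Higher values indicate that the impaired signal is closer to the reference. A metric of this kind does not replace subjective assessment.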

Video compression method (digital only)

Uncompressed video delivers maximum quality, but at a very high data rate. A variety of methods are used to compress video streams, with the most effective ones using a group of pictures (GOP) to reduce spatial and temporal redundancy. Broadly speaking, spatial redundancy is reduced by registering differences between parts of a single frame; this task is known as intraframe compression and is closely related to image compression. Likewise, temporal redundancy can be reduced by registering differences between frames; this task is known as interframe compression, including motion compensation and other techniques. The most common modern compression standards are MPEG-2, used for DVD, Blu-ray and satellite television, and MPEG-4, used for AVCHD, mobile phones (3GP) and the Internet.
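Two of the ideas above lend themselves to a short sketch: how quickly the uncompressed data rate grows, and how recording only the differences between frames reduces temporal redundancy. The code is a toy model, not how MPEG-2 or MPEG-4 work internally: frames are flat lists of pixel values, there is no motion compensation, and the 1920 by 1080, 24-bit, 30 frame/s figures are assumed example values rather than numbers from the text.

```python
def uncompressed_bit_rate(width, height, bits_per_pixel, frames_per_second):
    """Return the raw data rate in bits per second."""
    return width * height * bits_per_pixel * frames_per_second


def frame_delta(previous, current):
    """Record only the pixels that changed: {pixel_index: new_value}."""
    return {
        index: new
        for index, (old, new) in enumerate(zip(previous, current))
        if old != new
    }


def apply_delta(previous, delta):
    """Rebuild the current frame from the previous frame and a delta."""
    frame = list(previous)
    for index, value in delta.items():
        frame[index] = value
    return frame


rate = uncompressed_bit_rate(1920, 1080, 24, 30)
print(f"{rate / 1e9:.2f} Gbit/s uncompressed")

frame_a = [10, 10, 10, 10, 10, 10]
frame_b = [10, 10, 99, 10, 10, 10]  # a single pixel changed
delta = frame_delta(frame_a, frame_b)
print(delta)
```

When consecutive frames are mostly identical, the delta is far smaller than the frame itself, which is the redundancy that interframe compression exploits.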

Stereoscopic

Stereoscopic video for 3D film and other applications can be displayed using several different methods:[20][21]

Two channels: a right channel for the right eye and a left channel for the left eye. Both channels may be viewed simultaneously by using light-polarizing filters 90 degrees off-axis from each other on two video projectors. These separately polarized channels are viewed wearing eyeglasses with matching polarization filters.
Anaglyph 3D where one channel is overlaid with two color-coded layers. This left and right layer technique is occasionally used for network broadcast or recent anaglyph releases of 3D movies on DVD. Simple red/cyan plastic glasses provide the means to view the images discretely to form a stereoscopic view of the content.
One channel with alternating left and right frames for the corresponding eye, using LCD shutter glasses that synchronize to the video to alternately block the image to each eye, so the appropriate eye sees the correct frame. This method is most common in computer virtual reality applications such as in a Cave Automatic Virtual Environment, but reduces effective video framerate by a factor of two.
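Of the methods listed above, the anaglyph approach is the simplest to sketch in code. The example below assumes one common red/cyan convention, in which the red channel comes from the left-eye image and the green and blue channels come from the right-eye image. Images are modeled as lists of rows of (R, G, B) tuples, and the function name `anaglyph` is hypothetical.

```python
def anaglyph(left, right):
    """Overlay two equally sized RGB images into one red/cyan anaglyph.

    Assumed convention: red from the left-eye image, green and blue
    (together cyan) from the right-eye image.
    """
    if len(left) != len(right) or any(
        len(l_row) != len(r_row) for l_row, r_row in zip(left, right)
    ):
        raise ValueError("left and right images must have the same size")
    return [
        [(l_px[0], r_px[1], r_px[2]) for l_px, r_px in zip(l_row, r_row)]
        for l_row, r_row in zip(left, right)
    ]


left_eye = [[(200, 30, 30), (10, 20, 30)]]
right_eye = [[(40, 50, 60), (70, 80, 90)]]
print(anaglyph(left_eye, right_eye))
```

Viewed through red/cyan glasses, each eye then receives mainly the layer intended for it.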

source – wikipedia