Ardent Tool of Capitalism

The go-to place for all IBM PS/2 and Micro Channel enthusiasts

Capturing Digital Video Using DVI

Multimedia and the i750 video processor

Source: Dr. Dobb's Journal, July 1992
Author: James L. Green
This article contains the following executables: AVKCAPT.ZIP

Touring the AVK
Capturing Digital Video
Building the Recorder
Controlling the Recorder
Writing Compressed Data to Disk
Conclusion

Data Compression and the AVK
Compressing Data
Data Streams

Listing One Windows AVK Capture Program - AvkCapt.h (?)
Listing Two Windows AVK Capture Program - Create Recorder
Listing Three Windows AVK Capture Program - Recorder Control
Listing Four Windows AVK Capture Program - Write Captured Data to Disk

The DVI multimedia tools, developed by Intel and IBM, provide application developers and users with a highly integrated set of multimedia capabilities. The ActionMedia II delivery board available for ISA and Micro Channel bus PCs can be used to play digital audio and video data on desktop PCs running DOS, Windows, or OS/2. ActionMedia II cards utilize the i750 video processor to perform real-time encoding and decoding of digital video images and come configured with two megabytes of video memory (VRAM). The system software used to enable these capabilities under Windows and OS/2 is called the "audio video kernel" (AVK). AVK provides control over digital multimedia elements such as audio, video, and still images.

By attaching the optional capture module to the ActionMedia II delivery card, applications can capture and compress audio and video data. All of the analog signals (both audio and video) enter the system via an 8-pin mini-DIN connector located on the delivery board. A variety of video signals (Y-C, RGB, and Composite) are supported, as well as stereo audio. The capture subsystem performs analog-to-digital conversion of the source signal and deposits the data into VRAM. Digitizing, compressing, and displaying are independent events under the control of the software. This enables a variety of data-flow scenarios. For example, the data can be digitized and displayed without compression (laser-disc emulation), or the data can be digitized, compressed, and stored (or transmitted) without displaying (video mail/teleconferencing).

Touring the AVK

AVK is a set of OS-independent, dynamically linked libraries that provide applications with a collection of components similar to those found in a recording studio. These objects can be configured in various ways for manipulating multimedia data. All AVK function calls take the form AvkObjectMethod(Params). The objects defined in the AVK programming interface are: groups, buffers, streams, views, images, and connectors.

An AVK group is the unit of control synchronization and is analogous to the tape-transport functions of a tape deck. Group calls include starting, pausing, and recording. A group buffer is the digital representation of a tape; an area in VRAM used as a temporary repository of compressed audio and video data. Since the audio and video data is often interleaved, a group buffer can contain multiple streams of data as long as they all play at the same rate, just as all the tracks on an analog tape must pass the tape heads at the same rate.

A stream is analogous to a track of audio or video data. While a motion-video sequence is physically delivered as a series of consecutive frames, it can be viewed as a logical stream of data. A video stream is implemented as a circular array of bitmaps. While capturing, the digitizer on the capture module places each frame into one of the bitmaps, while the encode task running on the i750 video processor compresses each frame and places it into the group buffer. The audio data is handed to an audio DSP for encoding before it is placed into its own group buffer.
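The circular array of bitmaps can be pictured as a small ring buffer with the digitizer as producer and the encode task as consumer. The following is only a sketch of that idea: the type, the slot count, and the function names are hypothetical and are not part of the AVK interface, and a real stream holds bitmaps in VRAM rather than frame numbers in host memory.

```c
#include <stddef.h>

#define RING_SLOTS 4                /* hypothetical number of bitmaps in the stream */

typedef struct {
    int    frame_no[RING_SLOTS];    /* stand-in for the bitmap contents */
    size_t head;                    /* next slot the digitizer fills */
    size_t tail;                    /* next slot the encoder drains */
    size_t count;                   /* slots currently holding a frame */
} FrameRing;

static void ring_init(FrameRing *r)
{
    r->head = r->tail = r->count = 0;
}

/* Digitizer side: returns 0 if every slot is still in use (frame dropped). */
static int ring_put(FrameRing *r, int frame_no)
{
    if (r->count == RING_SLOTS)
        return 0;
    r->frame_no[r->head] = frame_no;
    r->head = (r->head + 1) % RING_SLOTS;
    r->count++;
    return 1;
}

/* Encoder side: returns 0 if no digitized frame is waiting. */
static int ring_get(FrameRing *r, int *frame_no)
{
    if (r->count == 0)
        return 0;
    *frame_no = r->frame_no[r->tail];
    r->tail = (r->tail + 1) % RING_SLOTS;
    r->count--;
    return 1;
}
```

If the encoder falls behind by more than RING_SLOTS frames, ring_put() starts refusing frames, which is the host-side analogue of a dropped frame.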

Another AVK object, called a "view," implements the notion of a video monitor. A view is a special kind of bitmap that can genlock to the host display system, allowing DVI video and standard VGA/XGA graphics to be mixed on a pixel-by-pixel basis. Views also include a collection of rectangular visual regions called "boxes" which are mapped into windows by the application. Video streams and still images are typically the sources of these visual regions. The concept of the view is analogous to a visual "mix." Applications can create and maintain multiple views and select the view to be monitored on the display.

If the group is a tape deck, and the view is a monitoring system, then there needs to be a way to connect them. This is handled by an AVK object called a "connector," analogous to a channel on an audio/video mixing board. It has an input ("source"), an output ("destination"), and parameters for altering the data in real time. Video streams, images, views, and the digitizer can all be connected in various configurations depending on the application's requirements. At its simplest, the connector is a higher-level abstraction of a bitmap copy operation. Connectors allow boxes to be defined for the source and destination bitmaps. The size of the boxes can be modified in real time to allow resizing and relocating of images to support windowing. Connectors behave differently, depending on which objects are used as their source and destination. For example, if the source is a video stream and the destination is a view, the connector will copy each frame automatically based on the frame rate. If the source is an image and the destination is a view, the connector will perform a single copy. Connectors also provide control for scaling, cropping, and adjusting the tint, contrast, saturation, and brightness of the image.
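To make the "bitmap copy with boxes" idea concrete, here is a minimal host-memory sketch of an unscaled box-to-box copy. It is not how AVK implements connectors (that work is done on the board); the Rect type and copy_box() are hypothetical names, and one byte per pixel is assumed purely for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical rectangle type; the layout of AVK's own BOX is not shown here. */
typedef struct { int x, y, w, h; } Rect;

/* Copy the source box into the destination box without scaling.
   The copy is clipped to the smaller of the two boxes, so a larger
   destination shows the image in its upper-left corner. */
static void copy_box(const unsigned char *src, int srcStride, Rect sb,
                     unsigned char *dst, int dstStride, Rect db)
{
    int w = sb.w < db.w ? sb.w : db.w;
    int h = sb.h < db.h ? sb.h : db.h;
    int row;

    for (row = 0; row < h; row++)
        memcpy(dst + (db.y + row) * dstStride + db.x,
               src + (sb.y + row) * srcStride + sb.x,
               (size_t)w);
}
```

Calling copy_box() once corresponds to the image-to-view case; calling it once per frame corresponds to the stream-to-view case.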

Capturing Digital Video

AvkCapt is a Windows program that captures video and audio from an analog source using Intel's ActionMedia II board set. AvkCapt allows you to monitor the analog source and capture the audio/video data to a file. You can enter a filename, turn monitoring on and off, and turn capturing on and off by making selections from pull-down menus. When you begin monitoring, AvkCapt digitizes the data, sending the audio out to the speakers (attached to the delivery board) and the video to the computer's display screen. When capture is toggled on, AvkCapt begins compressing the incoming data and writing it out to a file.

The audio and video data are compressed using different algorithms (see the sidebar "Data Compression and the AVK"). The video is compressed using the real-time video (RTV 2.0) algorithm at a resolution of 128x240 pixels at 30 frames per second (NTSC) or 128x288 pixels at 25 frames per second (PAL).

The AVK function AvkDeviceVideoIn() allows the application to determine the type of source video. RTV doubles the number of horizontal pixels on playback, resulting in a 256x240 (NTSC) or 256x288 (PAL) video. Using NTSC as the example, if this video is displayed using a 512x480 view, the result will appear in a quarter-screen video window. If a 256x240 view is used, the video will appear full screen. However, since the horizontal resolution of the capture stream is only 128 pixels, we can't show the monitored video at full-screen size on the 256x240 view. This is because the current version of AVK can't scale the video up in real time using a connector. In AVK, if the resolution of the destination is larger than the source, the video will be displayed in the upper-left corner of the destination box. Therefore, AvkCapt uses a fixed window of 128x120 pixels to display the monitored video.

The source to the AvkCapt program is longer than can be reproduced here. I can, however, describe the three main aspects of the program apart from the GUI interaction: configuring the AVK objects for recording audio and video data, controlling the flow of data through the system, and writing the compressed data to disk.


Building the Recorder

As described above, AVK provides a collection of components that can be configured in a variety of ways. For our purposes, we need to build an audio/video recorder.

Figure 1 illustrates the configuration used by the AvkCapt program. In any AVK program, the first steps include initiating an AVK session and opening the ActionMedia II board. The function InitAvk() in Listing Two, page 90, begins an AVK session with a call to AvkBeginMsg(). This function takes the application's window handle as one of its parameters. AVK will send messages to this window to notify it of various events. Next, the board is opened for the exclusive use of the application with a call to AvkDeviceOpen(). Finally, a request is sent to the device to identify the capture sync of the source video connected to the digitizer. AVK will respond to this request by sending an AVK_IDENTIFY message to the application window. The capture sync will be returned as part of the 32-bit parameter to the message. The application's main window procedure intercepts this message, and the capture sync is passed as a parameter to the CreateAvkResources() function. This function builds the recorder by creating and formatting the appropriate AVK objects.

CreateAvkResources() calls a number of other functions that do the real work. The GetDevCaps() function uses AvkGetDevCaps() to retrieve the device capabilities from the AVK.INI file. One of the attributes retrieved by this call is the DviMonitorSync, which is used to decide on the type of the AVK view and the x and y resolutions. Once this value is known, the attributes of the view-control structure can be defined. Since more than one monitor choice can be bitmapped in DviMonitorSync, we default in whatever order most suits the specific application's needs. In this case, we let VGA take precedence over XGA if both are indicated, and either VGA or XGA over either PAL or NTSC. We then calculate the screen-to-AVK coordinate-conversion deltas. These deltas will be used to convert from the native-screen resolution to the AVK-view resolution. For example, given a VGA-screen resolution of 640x480 and an AVK View of 256x240, we convert an x coordinate with the formula: Xavk = (int) ((double) Xscreen* (256.0/640.0)). This is necessary because the video pixels have a 5/4 aspect ratio. (There are five video pixels for every four VGA pixels.)
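The conversion formula can be tried in isolation. The sketch below mirrors the xDelta/yDelta calculation in GetDevCaps(), but the ViewScale structure and the function names are hypothetical stand-ins, not the code AvkCapt actually uses to set its destination box.

```c
/* Hypothetical holder for the screen-to-view scale factors. */
typedef struct {
    double xDelta;   /* view width  / screen width  */
    double yDelta;   /* view height / screen height */
} ViewScale;

static void scale_init(ViewScale *s, int cxScreen, int cyScreen,
                       int cxView, int cyView)
{
    s->xDelta = (double)cxView / (double)cxScreen;
    s->yDelta = (double)cyView / (double)cyScreen;
}

/* Convert a screen x coordinate to a view x coordinate (truncating). */
static int screen_to_view_x(const ViewScale *s, int xScreen)
{
    return (int)((double)xScreen * s->xDelta);
}

/* Convert a screen y coordinate to a view y coordinate (truncating). */
static int screen_to_view_y(const ViewScale *s, int yScreen)
{
    return (int)((double)yScreen * s->yDelta);
}
```

With a 640x480 screen and a 256x240 view, the screen point (320, 240) maps to the view point (128, 120).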

Once the view parameters have been defined, the CreateView() function is used to create and display an AVK view by calling AvkViewCreate() and AvkViewDisplay(), respectively. RTV uses a YUV9 bitmap format (YUV color space with 4-1-1 subsampling). The view is initially displayed as a black rectangle. Finally, a call to SetDstBox() sets the destination-box coordinates for the stream-to-view connector according to the coordinates of the main window's client rectangle.

The LoadVshFile() function loads data used by RTV during the compression process. (A discussion of the VSH data is beyond the scope of this article.) Now that we have identified and configured the source for the data (the digitizer) and the destination (the view), we need to create a capture group. As shown in CreateCaptureGroup(), two group buffers are created--one for audio and one for video.

AVK requires that separate buffers be used when capturing data, although this data is typically interleaved together when it is written to disk. On playback, the group buffers can be configured to contain multiple streams of data. This allows the interleaved files to be played as is, without the application having to parse the data back into separate streams. In addition to the group buffers, which use video RAM on the ActionMedia II board, buffers in host RAM are also created for holding video and audio frames while they are being written to disk.

CreateVideoStream() creates and formats a video stream for the video-capture buffer. The RTV encoding parameters Rtv20Args and the x and y resolutions for the specific capture sync are passed along with the VSH data read in by LoadVshFile(). Once the video stream has been formatted, the area of memory used to store the VSH data can be discarded, since AvkVidStrmFormat() makes its own copy.

The last step in building the recorder is to create the connectors from the digitizer to the video stream and from the video stream to the view. These connectors act like the channels in a mixing console, allowing us to control the flow of data from one place to another. When the connector from the digitizer to the video stream is enabled, capture data will begin to flow from the board to the stream. When the connector between the video stream and the view is enabled, the captured data will begin to flow from the video stream to the view and will appear on the screen in the rectangle defined in View.DstBox. Creating the audio stream is more straightforward: Simply create the stream and format it with the frame rate, sample rate, and algorithm name (in this case ADPCM4).

Closing the AVK session is a simple matter of calling AvkEnd() and freeing up the memory used for the host I/O buffers. While there are calls in AVK to explicitly deallocate AVK objects, AvkEnd() will implicitly destroy all created objects.

Controlling the Recorder

There are two types of control objects in the AVK library--connectors and groups. Groups control the flow of data through the compression/decompression process, and connectors control the flow of visual data through the monitoring system. These data flows are shown in Figure 2.

AvkCapt defines four states that determine its behavior: uninitialized, initialized, monitoring, and capturing.

AvkCapt uses the three functions ToState(), IsState(), and GetState(), shown in Listing Three, page 94, to alter and query the current state. These states are used by the application to control the menu options available to the user.

The ToggleMonitor() and ToggleCapture() functions (also shown in Listing Three) illustrate how AvkCapt controls the connectors and groups by changing state. In ToggleMonitor(), if the current state is initialized and monitoring is off, we turn it on. If monitoring is on and we are not capturing, we turn it off. The functions MonitorOn() and MonitorOff() are used to do the real work.

MonitorOn() uses the AvkConnEnable() function to enable the connectors from the digitizer to the video stream and from the video stream to the view, causing video to be displayed, and it turns the audio on by calling the AvkDeviceAudioIn() function. MonitorOff() turns off the flow of video by hiding the connectors. AvkConnHide() paints the key color (black) into the connector's destination and then disables the connector. Another call to AvkDeviceAudioIn() turns off audio monitoring.

ToggleCapture() toggles the capture state on or off (assuming a file has been opened to receive the captured data). If we are monitoring, we turn on capture by starting the group. If we are already capturing, we turn it off by pausing the group. For any other state, we simply return without doing anything.
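The toggle logic described above amounts to a small state machine. The sketch below reproduces only the transitions, with the AVK calls replaced by comments. The lowercase function names are hypothetical stand-ins for the ones in Listing Three, and the assumption that pausing the group returns the program to the monitoring state is mine rather than something stated in the text.

```c
#include <stdbool.h>

#define ST_UNINITIALIZED 0
#define ST_INITIALIZED   1
#define ST_MONITORING    2
#define ST_CAPTURING     3

static int state = ST_UNINITIALIZED;

static int  get_state(void) { return state; }
static bool is_state(int s) { return state == s; }
static void to_state(int s) { state = s; }

/* Returns true if the monitoring state changed. */
static bool toggle_monitor(void)
{
    if (is_state(ST_INITIALIZED)) {
        /* MonitorOn(): enable both connectors, turn audio on */
        to_state(ST_MONITORING);
        return true;
    }
    if (is_state(ST_MONITORING)) {
        /* MonitorOff(): hide both connectors, turn audio off */
        to_state(ST_INITIALIZED);
        return true;
    }
    return false;   /* uninitialized, or capturing: leave monitoring alone */
}

/* Returns true if the capture state changed. */
static bool toggle_capture(bool fileOpen)
{
    if (!fileOpen)
        return false;               /* nowhere to write the data */
    if (is_state(ST_MONITORING)) {
        /* start the group */
        to_state(ST_CAPTURING);
        return true;
    }
    if (is_state(ST_CAPTURING)) {
        /* pause the group (assumed to fall back to monitoring) */
        to_state(ST_MONITORING);
        return true;
    }
    return false;                   /* any other state: do nothing */
}
```

Keeping all transitions in two functions makes it easy to derive the enabled menu items from get_state() alone.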

Writing Compressed Data to Disk

The most difficult part of capturing the incoming digitized data is keeping up with it. If AvkCapt does not read the frames from the VRAM buffers fast enough, the frames will be lost, and a series of blank frames will have to be inserted to take their place (in order to keep the frame rate constant). This will cause skipping effects on playback. On the other hand, if too much time is spent retrieving data, the message loop may not respond promptly and mouse action may be degraded.

AvkCapt illustrates two different approaches to retrieving data in a timely manner. The first involves calling a read routine each time AvkCapt receives an AVK_CAPTURE_DATA_AVAILABLE message from AVK informing it that a designated amount of data (called the "hungry granularity") has been captured into a VRAM group buffer. The application sets this level when creating the group buffer.

The read routine then retrieves as much data from the VRAM buffer as it can, parses it into frames, and writes it out to the AVSS file. The application then returns to process the message loop and awaits the next AVK_CAPTURE_DATA_AVAILABLE message.

The second method (enabled by selecting the Timer option from the File pop-up menu) involves setting up a Windows timer and calling the same read routine on each timer tick. (We use a timer tick of 500 milliseconds in AvkCapt.) This will result in a maximum of two calls per second, so the capture function has to write about 15 frames per call to keep up. CaptureAvioData() writes out more than that if more data is present, so the timer messages may back up. Since Windows discards these if another set of timer ticks is already waiting in the queue, this is not a problem.
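The "about 15 frames per call" figure follows directly from the frame rate and the timer interval. As a quick check, the hypothetical helper below (not part of AvkCapt) computes how many frames arrive during one timer interval, rounding up so that a fractional frame still counts.

```c
/* Frames that arrive during one timer interval, rounded up. */
static unsigned frames_per_tick(unsigned framesPerSecond, unsigned intervalMs)
{
    return (framesPerSecond * intervalMs + 999u) / 1000u;
}
```

For NTSC at 30 frames per second and a 500-millisecond interval this gives 15 frames per tick; for PAL at 25 frames per second it gives 13.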

AvkCapt's capturing performance can be tuned by varying the TIMER_INTERVAL, the HOST_BUF_SIZE, or the value for CAPTURE_LOOPS (which dictates how many iterations of the read/write loop will be executed in CaptureAvioData() before it is forced back to the main message loop). These values are defined in avkcapt.h; see Listing One, page 90.

Listing Four, page 94, shows the CaptureAvioData() and ReadGrpBuf() functions. These functions retrieve frames from the group buffers in VRAM and write them out to an AVSS file on disk. AvkCapt uses the AVKIO file I/O subsystem to create an AVSS file. Video frames and one frame's worth of audio samples are retrieved separately from their respective buffers. The AVKIO function AvioFileFrmWrite() interleaves the video and audio into the AVSS file. Each iteration of the main loop starts by checking to see whether the application's video or audio host RAM buffer is empty, and, if so, it reads one buffer's worth of frames from the VRAM group buffers. ReadGrpBuf() is used to read newly captured frames from an AVK group buffer into one of the application's host RAM buffers. The count of bytes read is put in the CAPT structure's BufDataCnt element. If any data is read, the caller's flag, pbDataRead, is set to TRUE. Next we loop through the video and audio host RAM buffers writing out matched video and audio frames to the file. When we run out of either video or audio frames, we loop back to the top to retrieve more frames. This loop continues until all frames currently captured in VRAM have been retrieved, or until the loop has executed CAPTURE_LOOPS times. We use this countdown value to prevent the loop from executing for too long without giving the message loop time to run. If frames are being captured as fast as we are reading them, we might otherwise never exit this loop.
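The shape of this loop can be shown with simulated buffers that hold frame counts instead of compressed data. This is a sketch of the control flow only: the Chan type, refill(), and drain_frames() are hypothetical, and the real CaptureAvioData() works with byte counts, frame parsing, and AvioFileFrmWrite().

```c
#define CAPTURE_LOOPS 10    /* same value as in Listing One */

/* Simulated capture channel: frame counts stand in for real data. */
typedef struct {
    unsigned vram;   /* frames waiting in the VRAM group buffer */
    unsigned host;   /* frames waiting in the host RAM buffer */
} Chan;

/* Stand-in for ReadGrpBuf(): move up to one batch from VRAM to host RAM. */
static void refill(Chan *c, unsigned batch)
{
    unsigned n = c->vram < batch ? c->vram : batch;
    c->vram -= n;
    c->host += n;
}

/* Write matched video/audio frames; returns the number of pairs written. */
static unsigned drain_frames(Chan *vid, Chan *aud, unsigned batch)
{
    unsigned written = 0;
    int loops = CAPTURE_LOOPS;

    while (loops-- > 0) {
        if (vid->host == 0) refill(vid, batch);
        if (aud->host == 0) refill(aud, batch);
        if (vid->host == 0 || aud->host == 0)
            break;                      /* nothing left to pair up */
        while (vid->host > 0 && aud->host > 0) {
            vid->host--;                /* one video frame ...           */
            aud->host--;                /* ... and its audio are written */
            written++;
        }
    }
    return written;
}
```

With a batch of 4 frames, a backlog of 25 frame pairs is drained in a single call, while a backlog of 100 is cut off after 40 pairs so that control returns to the message loop.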

It is rather unlikely, but possible, that we will have a re-entrancy problem here. Since the function creates a message box in the case of an error, it can allow the message loop to process new messages before we exit it. This might result in a new timer tick or an AVK message causing reentry before we have finished displaying the error message and killing the process. To prevent this contingency, we use an ownership semaphore. If the semaphore is set when we enter, something has gone wrong in the currently executing code. So, instead of just blocking on the semaphore, the new occurrence exits.

The semaphore is set to indicate that the code is executing and is cleared as the last operation before a successful exit. Note that we do not clear the semaphore before exiting on an error condition, since we will be terminating the application on any error here and do not want to begin executing this code again between this exit and the application's termination.
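In a single-threaded message loop, an ownership semaphore of this kind can be as simple as a static flag. The sketch below is an assumption about the general shape of such a guard, not the code from Listing Four; guarded_capture() and the step functions are hypothetical, and the step callbacks exist only so the re-entry and error paths can be exercised.

```c
#include <stddef.h>

typedef int (*StepFn)(void *ctx);   /* returns nonzero on success */

static int busy = 0;                /* the ownership flag */

/* Returns 1 on success, 0 on error, -1 if entered while already running. */
static int guarded_capture(StepFn step, void *ctx)
{
    if (busy)
        return -1;      /* re-entered: exit at once rather than block */
    busy = 1;

    if (!step(ctx))
        return 0;       /* error: flag stays set on purpose, since the
                           application is about to terminate */

    busy = 0;           /* cleared as the last operation on success */
    return 1;
}

/* Example steps used to exercise the guard. */
static int step_ok(void *ctx)   { (void)ctx; return 1; }
static int step_fail(void *ctx) { (void)ctx; return 0; }

/* Simulates a message arriving mid-call and re-entering the guard. */
static int step_reenter(void *ctx)
{
    *(int *)ctx = guarded_capture(step_ok, NULL);
    return 1;
}
```

After step_fail() has run once, every later call returns -1, which matches the intent of never executing the capture code again between the error and the application's termination.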

Conclusion

AVK's function calls are identical between the Windows and OS/2 versions of the library. So the bulk of the code I have discussed will remain the same for an OS/2 implementation of AvkCapt. The main difference will be in the code that writes the data to disk. With OS/2, this can be accomplished using a couple of threads and sharing a common-host memory buffer, eliminating the need for the timer-tick mechanism.

It is also possible to create other configurations using the AVK objects. For example, applications can build recorders that capture only video or only audio data, players that play audio and video data from different sources, or players that combine motion-video and still-image data to a common view.

This kind of flexibility has its costs in terms of code complexity, however. Both the i750 video processor and AVK were architected to be platform independent. AVK can support a variety of higher-level APIs which encapsulate OS-specific file-I/O functions.

As one example, Intel and IBM are working on a higher-level library that implements the digital-video media-control interface (DV MCI) command set for multimedia extensions to Windows and OS/2.

QuickTime will be implemented on top of AVK by New Video (Venice, California) for its DVI Macintosh products. (See "The QuickTime/AVK Connection" on page 28.) Both DV MCI products and QuickTime provide "preconfigured" player/recorder objects for developers who do not need or want to roll their own.

Acknowledgments

The author would like to express his appreciation to John Novack, who developed the AvkCapt program described in this article.

Data Compression and the AVK

The Audio Video Kernel (AVK) is a multilayered architecture that isolates hardware-specific features from the application programmer while enabling porting of audio and video data to other platforms.

The AVK is itself sandwiched between an environment-specific API (in this case, the Windows API) and the ActionMedia II hardware. The ActionMedia II board includes the i750 video processor, an optional capture module, and typically two megabytes of local video memory (VRAM). The i750 consists of two processors: the 82750PB pixel processor and the 82750DB display processor.

Closest to the hardware is the microcode engine, a collection of routines loaded into instruction memory aboard the pixel processor. These routines manage real-time requirements such as task scheduling, data compression and decompression, and image scaling and copying. The big win here is that by loading these routines into instruction memory, there is no hardwiring, for example, of compression and decompression algorithms. The microcode routines can be modified to add or change functionality without updating the hardware.

The next layer in the AVK, the audio/video driver (AVD), provides a C interface to the ActionMedia II hardware, thus providing access to each component of the board. Included are functions to access VRAM, load microcode functions into instruction memory aboard the pixel processor, set display formats for the display processor, and access the audio and capture subsystems. Intel has also created a conceptual model of a "digital production studio" which contains individual subsystems that correspond to real-world systems such as tape decks, effects processors, mixing boards, and so on. The audio/video library (AVL) adds a set of multimedia functions that are independent of the host environment to implement these concepts.

Compressing Data

The AVK currently supports two forms of compression for video images. The first is real-time video (RTV), which is implemented in microcode and processed on the pixel processor. RTV takes multiple passes over the video data using several techniques, including frame differencing and Huffman coding to reduce the bit rate. Therefore, RTV compression is lossy. RTV 2.0 improves over the original algorithm with better image quality and adjustable data rates of up to 300 Kbytes/second, or twice that of CD-ROM rates. Image quality, which is directly affected by the amount of data lost, can be specified by the application as good, better or best. Good is typically used at lower data rates such as CD-ROM. Better quality is recommended when playback will occur from the hard drive and best is used when compressing to a RAM disk.

Production-level video (PLV) uses essentially the same compression as RTV, but takes advantage of offline compression services to gain the highest quality. Thus, PLV data including audio can be decompressed and displayed at rates similar to best quality RTV, which is closer to 150 Kbytes/second.

AVK uses JPEG for capture and lossy compression of still images, using 4:1:1 YUV color sampling. Though the compression technique is different, still images are treated as a special case of motion video that contains just a single frame. From the AVK perspective, programmers can open, play, and close a still image using the same calls as motion video. Developers can also adjust frequency for still images.

Audio data is compressed using a 4-bit adaptive-compression algorithm (ADPCM4). This is a straightforward technique that predicts the next audio sample based on the previous sample. As with video quality, audio quality can be specified as good, better, or best. But since this algorithm was originally intended for voice samples, it doesn't achieve the high quality one might expect. The encoding of audio data is an area that MPEG greatly improves upon over ADPCM4. Intel promises to support the MPEG standard when it is finalized, so look for big gains here.

Data Streams

A stream is composed of a set of audio or video frames. A video stream consists of a starting reference frame whose entire image is encoded. Subsequent frames, called "dependent" frames, are encoded as changes to previously decompressed images. (Only pixels that change between frames are stored.) Occasionally, when the image significantly changes or image quality begins to deteriorate, a new reference frame is inserted. A benefit of reference frames is that they can be decompressed independent of other frames. Note, however, that you currently cannot seek to a dependent frame--only to a reference frame. Note also that audio frames are independent of video. As previously mentioned, they are compressed using a 4-bit ADPCM algorithm. Once decompressed, audio and video streams can be interleaved on a frame-by-frame basis.
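The relationship between reference and dependent frames can be illustrated with plain byte-wise differencing. This is a teaching sketch only: RTV's actual bit-stream format is not described in this article, and the functions below are hypothetical. It assumes one byte per pixel and that frame 0 of a stream is always a reference frame.

```c
#include <stddef.h>

/* Encode a dependent frame as per-pixel differences from the previous
   frame (modulo 256). Unchanged pixels become zero. */
static void encode_dependent(const unsigned char *prev, const unsigned char *cur,
                             unsigned char *delta, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        delta[i] = (unsigned char)(cur[i] - prev[i]);
}

/* Rebuild a dependent frame from the previous decoded frame and its delta. */
static void decode_dependent(const unsigned char *prev, const unsigned char *delta,
                             unsigned char *out, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        out[i] = (unsigned char)(prev[i] + delta[i]);
}

/* Seeking: walk back from the target to the nearest reference frame.
   isRef[i] is nonzero when frame i is a reference frame. */
static long seek_reference(const unsigned char *isRef, long target)
{
    while (target > 0 && !isRef[target])
        target--;
    return target;
}
```

Because a dependent frame is meaningless without its predecessor, a seek lands on the reference frame at or before the requested position, which is why only reference frames are valid seek targets.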

Finally, it's interesting to note that Intel, at the request of the Interactive Multimedia Association (IMA), is making available details of RTV's compressed video bit-stream format.

Documentation is available to developers on a licensing basis, thus opening the door for software-only decompression of AVSS files. Fluent Machines (Framingham, Massachusetts) is expected to be the first to offer a software-only solution. For more information, contact the IMA, 3 Church Circle, Suite 800, Annapolis, MD 21401; 410-626-1380.

--Michael Floyd

_CAPTURING DIGITAL VIDEO USING DVI_
by James L. Green

Listing One

//--- AvkCapt.h  Copyright Intel Corp. 1991, 1992, All Rights Reserved ---

#include "avkapi.h"

// File name for the RTV 2.0 VSH data file (from avkalg.h)
#define VSHFILE_NAME    AVK_RTV_20_ENCODE_DATA_NAME

// A couple of shorthand AVK #defines for convenience
#define OK      AVK_ERR_OK
#define NOW     AVK_TIME_IMMEDIATE
#define HNULL       ((HAVK)0)

// Values for capturing
#define AUD_SAMPLE_RATE (U32)33075
#define FRAME_RATE  (U32)33367

// Size of Capture Data Buffers
#define VID_BUF_SIZE    (256L * 1024L)
#define VID_BUF_GRAN    ( 64L * 1024L)
#define AUD_BUF_SIZE    (128L * 1024L)
#define AUD_BUF_GRAN    ( 16L * 1024L)
#define HOST_BUF_SIZE   32768U

// Maximum number of iterations of the capture loop
// before we are forced back to the main message loop
#define CAPTURE_LOOPS   10

// ID value for the capture Windows timer
#define TIMER_ID    1

// Number of milliseconds between timer ticks
#define TIMER_INTERVAL  500

// States for the capture engine
#define ST_UNINITIALIZED    0
#define ST_INITIALIZED      1
#define ST_MONITORING       2
#define ST_CAPTURING        3

// Control structure for the current view
typedef struct tagVIEW
{
    HAVK    hView;              // AVK View handle
    HAVK    hConnDigi2Strm;     // Digitizer to Video Stream connector
    HAVK    hConnStrm2View;     // Video Stream to View connector
    BOOL    bConnEnabled;       // TRUE if the connector is enabled
    WORD    DviMonitorSync;     // DviMonitorSync value from AVK.INI
    I16     cxView;             // View's x resolution
    I16     cyView;             // View's y resolution
    double  xDelta;             // used to convert screen
    double  yDelta;             //  coords to view
    I16     cxScreen;           // physical screen's x resolution
    I16     cyScreen;           // physical screen's y resolution
    U16     VidType;            // View's video type
    U16     BmFmt;              // View's bitmap format
    BOOL    bIsKeyed;           // TRUE if the View is keyed
    BOX     SrcBox;             // connector's source rectangle
    BOX     DstBox;             // connector's destination rectangle
} VIEW;

// Control structure for capture buffers
typedef struct tagCAPT
{
    HAVK        hGrpBuf;    // group buffer handle
    HAVK        hStrm;      // stream handle
    char far    *pBufHead;  // host RAM I/O buffer
    char far    *pBufCurr;  // current position in host I/O buffer
    U32         BufDataCnt; // amount of data in host I/O buffer
} CAPT;

// Structure for storing sync resolutions. The sync table
// will be an array of VIDEO_SYNC structures called Syncs[].
typedef struct tagVIDEO_SYNC
{
    WORD    xResRTV;    // RTV capture x resolution
    WORD    xResVid;    // Video stream premonitor x resolution
    WORD    yResVid;    // Video stream premonitor y resolution
    WORD    FrameRate;
    WORD    PixelAspect;
} VIDEO_SYNC;

// These sync values are subscripts into a table of VIDEO_SYNC structures
#define SYNC_NTSC   0
#define SYNC_PAL    1

Listing Two

// ---- Windows AVK Capture Program - Create Recorder  ----------------
// ---- Copyright Intel Corp. 1991, 1992, All Rights Reserved ---------

extern HWND     hwndMain;
extern VIDEO_SYNC   Syncs[];

// Local variables
static WORD State = ST_UNINITIALIZED; // current state of capture engine
WORD        CaptureSync = SYNC_NTSC;  // default to NTSC
char far    *pVshBuf    = NULL;   // buffer for reading VSH data.
U32     VshSize;    // size of the VSH data
VIEW        View;       // view control structure
AVIO_SUM_HDR    Avio;       // master control struct for AVSS file I/O
CAPT        Vid;        // video capture control structure
CAPT        Aud;        // audio capture control structure
I16     AvkRet;     // general AVK return code variable

// RTV 2.0 encoding arguments
AVK_RTV_20_ENCODE_ARGS Rtv20Args =
{
    12,         // argument count
    AVK_RTV_2_0,        // algorithm size
    0,0,            // x,y coords of origin
    128, 240,       // xLength, yLength
    3,          // still period
    0, 0,           // bytes,lines
    AVK_RTV_20_PREFILTER | AVK_RTV_20_ASPECT_25,    // flags
    0, 0            // quantization values
};

// AVK handles
HAVK    hAvk    = (HAVK)0;
HAVK    hDev    = (HAVK)0;
HAVK    hGrp    = (HAVK)0;

// Create AVK session and initialize the device
BOOL InitAvk()
{
    if (!IsState(ST_UNINITIALIZED))
        return TRUE;

    // Start an AVK session with messaging
    if ((AvkRet = AvkBeginMsg(hwndMain, &hAvk,
        AVK_SESSION_DEFAULT)) != OK)
        return DispAvkErr(AvkRet, "AvkBeginMsg");

    // Open the ActionMedia(R) device
    if ((AvkRet = AvkDeviceOpen(hAvk, 0,
        AVK_DEV_OPEN_EXCLUSIVE, &hDev)) != OK)
        return DispAvkErr(AvkRet, "AvkDeviceOpen");

    // Get the capture sync by calling AvkDeviceVideoIn()
    if ((AvkRet = AvkDeviceVideoIn(hDev, AVK_CONN_DIGITIZER)) != OK)
        return DispAvkErr(AvkRet, "AvkDeviceVideoIn");

    return TRUE;
}
// Check device capabilities and build the recorder
BOOL CreateAvkResources(WORD NewCaptureSync)
{
    switch (NewCaptureSync)
    {
        case AVK_SYNC_NTSC: CaptureSync = SYNC_NTSC;    break;
        case AVK_SYNC_PAL:  CaptureSync = SYNC_PAL;     break;
    }

    // Get the AVK device capabilities from AVK.INI
    if (!GetDevCaps(&View))
        return FALSE;

    if (!CreateView(&View))
         return FALSE;

    // The Vsh file contains data used in compressing
    // the incoming motion video into an RTV 2.0 file.
    if (!LoadVshFile())
        return FALSE;

    if (!CreateCaptureGroup())
        return FALSE;

    ToState(ST_INITIALIZED);

    return TRUE;
}
// Get the device capabilities from AVK
BOOL GetDevCaps(VIEW *pView)
{
    DVICAPS DevCaps;

    // Get the physical screen resolution from the system
    pView->cxScreen = GetSystemMetrics(SM_CXSCREEN);
    pView->cyScreen = GetSystemMetrics(SM_CYSCREEN);

    // Get the AVK device capabilities which were set in AVK.INI
    if ((AvkRet = AvkGetDevCaps(0, sizeof(DevCaps), &DevCaps)) != OK)
        return DispAvkErr(AvkRet, "AvkGetDevCaps");

    if (DevCaps.DigitizerRevLevel == 0)
        return DispErr("GetDevCaps",
          "Digitizer needed for capturing - check AVK.INI");

    if (DevCaps.DviMonitorSync & 0x10)      // VGA
    {
        pView->cxView   = 256;
        pView->cyView   = 240;
        pView->VidType = AVK_VID_VGA_KEYED;
        pView->bIsKeyed = TRUE;
    }
    else if (DevCaps.DviMonitorSync & 0x100)    // XGA
    {
        pView->cxView   =  256;
        pView->cyView   =  192;
        pView->VidType = AVK_VID_XGA_KEYED;
        pView->bIsKeyed = TRUE;
    }
    else if (DevCaps.DviMonitorSync & 0x02)     // PAL
    {
        pView->cxView = 306;
        pView->cyView = 288;
        pView->VidType = AVK_VID_PAL;
    }
    else if (DevCaps.DviMonitorSync & 0x01)     // NTSC
    {
        pView->cxView = 256;
        pView->cyView = 240;
        pView->VidType = AVK_VID_NTSC;
    }
    else
        return DispErr("GetDevCaps", "Invalid monitor sync");

    // Calculate Screen-To-AVK coordinate conversion deltas.
    pView->xDelta = (double)pView->cxView / (double)pView->cxScreen;
    pView->yDelta = (double)pView->cyView / (double)pView->cyScreen;

    return TRUE;
}
// Create and display an AVK View
static BOOL CreateView(VIEW *pView)
{
    if ((AvkRet = AvkViewCreate(hDev, pView->cxView, pView->cyView,
        AVK_YUV9, pView->VidType,  &pView->hView)) != OK)
        return DispAvkErr(AvkRet, "AvkViewCreate");

    // Display the View
    if ((AvkRet = AvkViewDisplay(hDev, pView->hView, NOW,
      AVK_VIEW_DISPLAY_DEFAULT)) != OK)
        return DispAvkErr(AvkRet, "AvkViewDisplay");

    // Set the destination box for the stream-to-view connector
    if (!SetDstBox(hwndMain))
        return FALSE;

    return TRUE;
}
// Set the destination box for the stream-to-view connector
BOOL SetDstBox(HWND hwndMain)
{
    RECT    WinRect;
    BOX     NewDstBox;

    GetClientRect(hwndMain, (LPRECT)&WinRect);
    ClientToScreen(hwndMain, (LPPOINT)&WinRect);
    WinRect.right = WinRect.left + (View.cxScreen >> 1) - 1;
    WinRect.bottom = WinRect.top + (View.cyScreen >> 1) -1;
    WinRect2AvkBox(&WinRect, &NewDstBox, &View);

    if (View.hConnStrm2View)
    {
        if ((AvkRet = AvkConnHide(View.hConnStrm2View, NOW)) != OK)
            return DispAvkErr(AvkRet, "AvkConnHide");

        if ((AvkRet = AvkViewCleanRect(View.hView,
            &View.DstBox)) != OK)
            return DispAvkErr(AvkRet, "AvkViewCleanRect");

        // Reset the destination of the connector to our new box
        if ((AvkRet = AvkConnModSrcDst(View.hConnStrm2View, NULL,
            &NewDstBox, NOW)) != OK)
            return DispAvkErr(AvkRet, "AvkConnModSrcDst");

        if ((AvkRet = AvkConnEnable(View.hConnStrm2View, NOW)) != OK)
            return DispAvkErr(AvkRet, "AvkConnEnable");

    }

    // Copy new destination coords into the view's destination box
    COPYBOX(&View.DstBox, &NewDstBox);

    return TRUE;
}
// Get the standard VSH file that comes with AVK
static BOOL LoadVshFile()
{
    int         fhVsh;
    OFSTRUCT    Of;

    // Open the VSH file
    if ((fhVsh = OpenFile(VSHFILE_NAME, &Of, OF_READ)) == -1)
        return DispErr("LoadVshFile",
            "Unable to find the file KE080200.VSH");

    VshSize = filelength(fhVsh);

    // Range check - Reject if VshSize == 0 or VshSize > 65535L
    if (!VshSize || VshSize & 0xffff0000)
    {
        _lclose(fhVsh);
        return DispErr("LoadVshFile",
            "VSH file is empty or too large to load");
    }

    // Allocate a buffer to stash the VSH file.
    if ((pVshBuf = MemAlloc((WORD)VshSize)) == NULL)
    {
        _lclose(fhVsh);
        return DispErr("LoadVshFile",
            "Unable to allocate VSH file buffer");
    }

    // Read the VSH data from the file
    if (_lread(fhVsh, pVshBuf, (WORD)VshSize) != (WORD)VshSize)
    {
        _lclose(fhVsh);
        return DispErr("LoadVshFile", "Unable to read VSH file");
    }

    // The data is in memory now, so close the file handle
    _lclose(fhVsh);

    return TRUE;
}
// Create Capture Group and resources needed for premonitoring
static BOOL CreateCaptureGroup()
{
    if ((AvkRet = AvkGrpCreate(hDev, &hGrp)) != OK)
        return DispAvkErr(AvkRet, "AvkGrpCreate");

    if ((AvkRet = AvkGrpBufCreate(hGrp, AVK_BUF_CAPTURE, VID_BUF_SIZE,
        VID_BUF_GRAN, 1, &Vid.hGrpBuf)) != OK)
        return DispAvkErr(AvkRet, "AvkGrpBufCreate");

    if ((AvkRet = AvkGrpBufCreate(hGrp, AVK_BUF_CAPTURE, AUD_BUF_SIZE,
        AUD_BUF_GRAN, 1, &Aud.hGrpBuf)) != OK)
        return DispAvkErr(AvkRet, "AvkGrpBufCreate");

    // Create host RAM I/O buffers for retrieving
    // video and audio frames and initialize them.
    if ((Vid.pBufHead = MemAlloc(HOST_BUF_SIZE)) == NULL
     || (Aud.pBufHead = MemAlloc(HOST_BUF_SIZE)) == NULL)
        return DispErr("CreateCaptureGroup",
          "Unable to allocate host RAM I/O buffer");
    Vid.BufDataCnt = (U32)0;
    Aud.BufDataCnt = (U32)0;

    if (!CreateVideoStream())
        return FALSE;

    if (!CreateAudioStream())
        return FALSE;

    if ((AvkRet = AvkGrpFlush(hGrp)) != OK)
        return DispAvkErr(AvkRet, "AvkGrpFlush");

    return TRUE;
}
// Create and format a video stream for the video capture buffer
static BOOL CreateVideoStream()
{
    if ((AvkRet = AvkVidStrmCreate(Vid.hGrpBuf, 0, &Vid.hStrm)) != OK)
        return DispAvkErr(AvkRet, "AvkVidStrmCreate");

    // Format the video stream
    Rtv20Args.xLen = Syncs[CaptureSync].xResRTV;
    Rtv20Args.yLen = Syncs[CaptureSync].yResVid;
    if ((AvkRet = AvkVidStrmFormat(Vid.hStrm,
        6,
        Syncs[CaptureSync].xResVid,
        Syncs[CaptureSync].yResVid,
        AVK_YUV9,
        Syncs[CaptureSync].FrameRate,
        AVK_RTV_2_0,
        &Rtv20Args, sizeof(Rtv20Args), sizeof(Rtv20Args),
        pVshBuf, VshSize, 64L * 1024L)) != OK)
        return DispAvkErr(AvkRet, "AvkVidStrmFormat");

    // Free the VSH buffer
    MemFree(pVshBuf);

    // Create a connector from the digitizer to the video stream
    if ((AvkRet = AvkConnCreate(AVK_CONN_DIGITIZER, NULL, Vid.hStrm,
        NULL, 0, &View.hConnDigi2Strm)) != OK)
        return DispAvkErr(AvkRet,
            "AvkConnCreate(Digitizer to Stream)");

    // Create the connector from the video stream to the view
    if ((AvkRet = AvkConnCreate(Vid.hStrm, NULL, View.hView,
      &View.DstBox, AVK_PRE_MONITOR, &View.hConnStrm2View)) != OK)
        return DispAvkErr(AvkRet, "AvkConnCreate (Stream to View)");

    return TRUE;
}
// Create and format an audio stream for the audio capture buffer
static BOOL CreateAudioStream()
{
    if ((AvkRet = AvkAudStrmCreate(Aud.hGrpBuf, 0, &Aud.hStrm)) != OK)
        return DispAvkErr(AvkRet, "AvkAudStrmCreate");

    // Format the audio stream
    if ((AvkRet = AvkAudStrmFormat(Aud.hStrm, FRAME_RATE,
        AUD_SAMPLE_RATE, AVK_ADPCM4, AVK_AUD_MIX, NULL, 0, 0)) != OK)
        return DispAvkErr(AvkRet, "AvkAudStrmFormat");

    return TRUE;
}
// Close the AVK session
BOOL EndAvk()
{
    BOOL    Ret = TRUE;

    if (hAvk != HNULL)
    {
        if ((AvkRet = AvkEnd(hAvk)) != OK)
        {
            DispAvkErr(AvkRet, "AvkEnd");
            Ret = FALSE;
        }
    }
    if (Vid.pBufHead)
    {
        MemFree(Vid.pBufHead);
        Vid.pBufHead = NULL;
    }
    if (Aud.pBufHead)
    {
        MemFree(Aud.pBufHead);
        Aud.pBufHead = NULL;
    }

    // Null out all of the AVK handles
    hAvk = hDev = HNULL;
    hGrp = HNULL;
    Vid.hGrpBuf = Vid.hStrm = HNULL;
    Aud.hGrpBuf = Aud.hStrm = HNULL;
    View.hView  = HNULL;
    View.hConnDigi2Strm = View.hConnStrm2View = HNULL;

    ToState(ST_UNINITIALIZED);

    return Ret;
}
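
GetDevCaps() in the listing above derives xDelta and yDelta as the ratio of the AVK view size to the physical screen size, and SetDstBox() hands them to WinRect2AvkBox(), which is not shown. As a rough illustration of that screen-to-view scaling, here is a minimal sketch in portable C. The SketchRect, SketchBox, and SketchView types and the InitDeltas(), HalfScreenRect(), and RectToBox() helpers are hypothetical stand-ins, and truncating toward zero is an assumption; the real WinRect2AvkBox() may round or clip differently.

```c
#include <assert.h>

/* Hypothetical stand-ins for the Windows RECT and the AVK BOX. */
typedef struct { int left, top, right, bottom; } SketchRect;
typedef struct { int x1, y1, x2, y2; } SketchBox;

/* The subset of the listing's VIEW structure used for scaling. */
typedef struct {
    int    cxScreen, cyScreen;  /* physical screen resolution */
    int    cxView, cyView;      /* AVK view resolution */
    double xDelta, yDelta;      /* screen-to-view scale factors */
} SketchView;

/* Compute the deltas the same way GetDevCaps() does. */
void InitDeltas(SketchView *v)
{
    assert(v->cxScreen > 0 && v->cyScreen > 0);
    v->xDelta = (double)v->cxView / (double)v->cxScreen;
    v->yDelta = (double)v->cyView / (double)v->cyScreen;
}

/* Build a rectangle half the screen wide and high, as SetDstBox() does. */
SketchRect HalfScreenRect(int left, int top, const SketchView *v)
{
    SketchRect r;
    r.left   = left;
    r.top    = top;
    r.right  = left + (v->cxScreen >> 1) - 1;
    r.bottom = top  + (v->cyScreen >> 1) - 1;
    return r;
}

/* Scale a screen rectangle into view coordinates (assumed: truncation). */
SketchBox RectToBox(const SketchRect *r, const SketchView *v)
{
    SketchBox b;
    b.x1 = (int)(r->left   * v->xDelta);
    b.y1 = (int)(r->top    * v->yDelta);
    b.x2 = (int)(r->right  * v->xDelta);
    b.y2 = (int)(r->bottom * v->yDelta);
    return b;
}
```

With a 1024x768 screen and the 256x192 XGA view chosen by GetDevCaps(), both deltas are 0.25, so a half-screen window whose client area starts at (100, 200) maps to the box (25, 50)-(152, 145) under these assumptions.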

Listing Three

//----  Windows AVK Capture Program - Recorder Control ------------
//----  Copyright Intel Corp. 1991, 1992, All Rights Reserved -----

// Sets a new state and enables/disables the applicable menu options
WORD ToState(WORD NewState)
{
    WORD    OldState;

    if (NewState == ST_CAPTURING
     || NewState == ST_MONITORING
     || NewState == ST_INITIALIZED
     || NewState == ST_UNINITIALIZED)
    {
        if (State != NewState)
        {
            OldState = State;
            State = NewState;
            UpdateMenus(State);
            return OldState;
        }
        else
            return NewState;
    }
    return 0xffff;
}
// Checks whether the current state equals the caller's query state
BOOL IsState(WORD QueryState)
{
    return State == QueryState;
}
// Returns the current state to the caller
WORD GetState()
{
    return State;
}
// Toggle monitoring on and off based on user input
BOOL ToggleMonitor(VOID)
{
    BOOL    bRet;

    switch (GetState())
    {
        case ST_INITIALIZED:    bRet = MonitorOn();     break;
        case ST_MONITORING:     bRet = MonitorOff();    break;
        default:                bRet = TRUE;            break;
    }
    return bRet;
}
// Turn on premonitoring
static BOOL MonitorOn()
{
    if ((AvkRet = AvkConnEnable(View.hConnDigi2Strm, NOW)) != OK
     || (AvkRet = AvkConnEnable(View.hConnStrm2View, NOW)) != OK)
        return DispAvkErr(AvkRet, "AvkConnEnable");

    if ((AvkRet = AvkDeviceAudioIn(hDev, AVK_AUD_CAPT_LINE_INPUT,
      AVK_MONITOR_ON)) != AVK_ERR_OK)
        return DispAvkErr(AvkRet, "AvkDeviceAudioIn");

    ToState(ST_MONITORING);

    SetClipTimer();

    return TRUE;
}
// Turn off premonitoring
static BOOL MonitorOff()
{
    KillClipTimer();

    if ((AvkRet = AvkConnHide(View.hConnStrm2View, NOW)) != OK
     || (AvkRet = AvkConnHide(View.hConnDigi2Strm, NOW)) != OK)
        return DispAvkErr(AvkRet, "AvkConnHide");

    if ((AvkRet = AvkDeviceAudioIn(hDev, AVK_AUD_CAPT_LINE_INPUT,
        AVK_MONITOR_OFF)) != AVK_ERR_OK)
        return DispAvkErr(AvkRet, "AvkDeviceAudioIn");

    ToState(ST_INITIALIZED);

    return TRUE;
}
// Toggles the capture on or off
BOOL ToggleCapture()
{
    // If no file has been opened, return
    if (!bAvioFileExists)
    {
        DispMsg("You must open a file before you can capture");
        return TRUE;
    }

    switch(GetState())
    {
        case ST_MONITORING:
            // If we are monitoring, turn on
            // capture by starting the group
            if ((AvkRet = AvkGrpStart(hGrp, NOW)) != OK)
                return DispAvkErr(AvkRet, "AvkGrpStart");
            ToState(ST_CAPTURING);
            break;

        case ST_CAPTURING:
            // If we are already capturing, turn
            // it off by pausing the group
            if ((AvkRet = AvkGrpPause(hGrp, NOW)) != OK)
                return DispAvkErr(AvkRet, "AvkGrpPause");
            break;

        default:
            // Any other state, just do nothing - no error
            break;
    }
    return TRUE;
}
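
Listing Three moves the recorder between four states. The sketch below isolates only the state bookkeeping so the transitions are easier to follow; it leaves out the AVK calls and the menu updates. The numeric values of the ST_ constants are assumptions (the real ones are defined in AvkCapt.h), ST_INVALID is a hypothetical name for the 0xffff return value, and ToggleMonitorSketch() is a hypothetical stand-in for ToggleMonitor() that only flips the state.

```c
#include <assert.h>

/* Assumed values; the real constants are defined in AvkCapt.h. */
enum {
    ST_UNINITIALIZED,
    ST_INITIALIZED,
    ST_MONITORING,
    ST_CAPTURING
};
#define ST_INVALID 0xffffu  /* returned by ToState() for unknown states */

static unsigned State = ST_UNINITIALIZED;

/* Same contract as the listing's ToState(): returns the old state on a
   change, the unchanged state otherwise, or ST_INVALID for a bad request. */
unsigned ToState(unsigned NewState)
{
    unsigned OldState;

    if (NewState > ST_CAPTURING)
        return ST_INVALID;
    if (State == NewState)
        return NewState;
    OldState = State;
    State = NewState;
    return OldState;
}

/* Returns the current state to the caller. */
unsigned GetState(void)
{
    return State;
}

/* State-only version of ToggleMonitor(): flips between INITIALIZED and
   MONITORING and ignores the request in any other state. */
void ToggleMonitorSketch(void)
{
    switch (State)
    {
        case ST_INITIALIZED: ToState(ST_MONITORING);  break;
        case ST_MONITORING:  ToState(ST_INITIALIZED); break;
        default:                                      break;
    }
}

/* Walk the usual sequence: initialize, monitor, capture. */
void DemoTransitions(void)
{
    assert(ToState(ST_INITIALIZED) == ST_UNINITIALIZED);
    ToggleMonitorSketch();
    assert(GetState() == ST_MONITORING);
    assert(ToState(ST_CAPTURING) == ST_MONITORING);
    ToggleMonitorSketch();              /* ignored while capturing */
    assert(GetState() == ST_CAPTURING);
}
```

Keeping every transition behind one function, as the listing does, gives a single place to validate the requested state and to refresh the menus whenever the state actually changes.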

Listing Four

// ---- Windows AVK Capture Program - Write Captured Data to Disk ------
// ---- Copyright Intel Corp. 1991, 1992, All Rights Reserved ----------

extern CAPT Aud;
extern CAPT Vid;
extern I16  AvkRet;
extern HAVK hGrp;
extern WORD CaptureSync;
AVIO_SUM_HDR    Avio;
BOOL            bAvioFileExists = FALSE;
I16             AvioRet;
static BOOL ReadGrpBuf(CAPT *, BOOL *);
I16     DispAvioErr(char *pMsg);

VIDEO_SYNC  Syncs[2] =
{
    { 128, 128, 240, AVK_NTSC_FULL_RATE, AVK_PA_NTSC },
    { 128, 153, 288, AVK_PAL_FULL_RATE,  AVK_PA_PAL  }
};

// Initialize the AVIO summary header and use it to create an AVSS file.
BOOL OpenAvioFile(char *pFileSpec)
{
    AVIO_VID_SUM FAR    *pVid;
    AVIO_AUD_SUM FAR    *pAud;
    VIDEO_SYNC          *pSync;

    if (!*pFileSpec)
        return DispErr("OpenAvioFile", "No file spec");

    // Clear out the Avio structure.

    _fmemset((char FAR *)&Avio, 0, sizeof(Avio));

    // Initialize the structure.
    Avio.SumHdrSize = sizeof(AVIO_SUM_HDR);
    Avio.VidSumSize = sizeof(AVIO_VID_SUM);
    Avio.AudSumSize = sizeof(AVIO_AUD_SUM);

    Avio.StrmCnt = 2;
    Avio.VidCnt = 1;
    Avio.AudCnt = 1;

    if ((AvioRet = AvioFileAlloc((AVIO_SUM_HDR FAR *)&Avio)) < 0)
        return DispAvioErr("AvioFileAlloc");

    // Fill out the video stream substructure.

    pSync = &Syncs[CaptureSync];    // sync data (NTSC or PAL)

    pVid = Avio.VidStrms;

    pVid->StrmNum = 0;                          // video stream number
    pVid->Type = AVL_T_CIM;                     // compressed data
    pVid->SubType = AVL_ST_YVU;                 // packed data
    pVid->StillPeriod = AVL_CIM_RANDOM_STILL;   // freq of still frames
    pVid->xRes = pSync->xResVid << 1;           // x resolution
    pVid->yRes = pSync->yResVid;                // y resolution
    pVid->BitmapFormat = AVK_BM_9;              // bitmap format
    pVid->FrameRate = pSync->FrameRate;         // frame rate
    pVid->PixelAspect = pSync->PixelAspect;     // pixel aspect ratio (NTSC or PAL)
    pVid->AlgCnt = 1;                           // only one algorithm
    pVid->AlgName[0] = AVK_RTV_2_0;             // RTV 2.0 compression alg

    // Fill out the audio stream substructure.

    pAud = Avio.AudStrms;

    pAud->StrmNum = 1;                          // audio stream number
    pAud->LeftVol = 100;                        // left channel volume = 100%
    pAud->RightVol = 100;                       // right channel volume = 100%
    pAud->FrameRate = pSync->FrameRate;         // frame rate
    pAud->SamplesPerSecond = AUD_SAMPLE_RATE;   // audio samples-per-second
    pAud->AudChannel = AVK_AUD_MIX;             // both speakers
    pAud->AlgCnt = 1;                           // number of algorithms
    pAud->AlgName[0] = AVK_ADPCM4;              // audio ADPCM4 algorithm

    // Now create the file with all standard AVSS headers.

    if ((AvioRet = AvioFileCreate((char far *)pFileSpec,
      (AVIO_SUM_HDR FAR *)&Avio, OF_CREATE)) < 0)
        return DispAvioErr("AvioFileCreate");

    bAvioFileExists = TRUE;

    return TRUE;
}
// This function retrieves frames from the Group Buffers
// in VRAM and writes them out to an AVSS file on disk.
BOOL CaptureAvioData()
{
    static BOOL         bInUse = FALSE;
    AVIO_FRM_HDR FAR    *pFrmHdr[2];    // frame header pointers
                                        // for video & audio
    BOOL                bDataRead;
    int                 Ret;
    U32                 VidFrmSize, AudFrmSize;
    WORD                Count;

    if (bInUse)
        return TRUE;

    bInUse = TRUE;

    // Error if no buffers have been allocated.
    if (!Vid.pBufHead || !Aud.pBufHead)
        return DispErr("CaptureAvioData",
            "NULL host RAM buffer pointer");

    Count = CAPTURE_LOOPS;

    do {
        // Init the data-read flag
        bDataRead = FALSE;

        if (!Vid.BufDataCnt)
        {
            if (!ReadGrpBuf(&Vid, &bDataRead))
                return FALSE;
        }
        if (!Aud.BufDataCnt)
        {
            if (!ReadGrpBuf(&Aud, &bDataRead))
                return FALSE;
        }

        while (Vid.BufDataCnt && Aud.BufDataCnt)
        {
            pFrmHdr[0] = (AVIO_FRM_HDR FAR *)Vid.pBufCurr;
            pFrmHdr[1] = (AVIO_FRM_HDR FAR *)Aud.pBufCurr;

            if ((Ret = AvioFileFrmWrite((AVIO_SUM_HDR FAR *)&Avio,
                pFrmHdr)) < 0)
                return DispAvioErr("AvioFileFrmWrite");

            VidFrmSize = (U32)sizeof(AVIO_FRM_HDR)
                + pFrmHdr[0]->StrmSize[0];
            Vid.pBufCurr += (WORD)VidFrmSize;
            Vid.BufDataCnt -= VidFrmSize;

            AudFrmSize = (U32)sizeof(AVIO_FRM_HDR)
                + pFrmHdr[1]->StrmSize[0];
            Aud.pBufCurr += (WORD)AudFrmSize;
            Aud.BufDataCnt -= AudFrmSize;
        }
    } while (bDataRead && Count--);

    bInUse = FALSE;
    return TRUE;
}
// Read newly captured frames from an AVK Group Buffer
// into one of the application's host RAM buffers
static BOOL ReadGrpBuf(CAPT *pCapt, BOOL *pbDataRead)
{
    // Only refill the buffer if it is empty
    if (!pCapt->BufDataCnt)
    {
        // Retrieve a buffer of frames.

        if ((AvkRet = AvkGrpBufRead(pCapt->hGrpBuf, HOST_BUF_SIZE,
          pCapt->pBufHead, &pCapt->BufDataCnt, AVK_ENABLE)) != OK)
            return DispAvkErr(AvkRet, "AvkGrpBufRead");

        // Set data-read flag if we read any data.

        *pbDataRead = pCapt->BufDataCnt == (U32)0 ? FALSE : TRUE;

        // Point back to start of buffer.

        pCapt->pBufCurr = pCapt->pBufHead;
    }
    return TRUE;
}
// Update and close an AVSS file using AVKIO.
BOOL CloseAvioFile()
{
    if (bAvioFileExists == TRUE)
    {
        // Update the file's header with current information that
        // AVKIO keeps in the Avio summary header.

        if ((AvioRet = AvioFileUpdate((AVIO_SUM_HDR FAR *)&Avio, 0)) < 0)
            return DispAvioErr("AvioFileUpdate");

        // Close the file.

        if ((AvioRet = AvioFileClose((AVIO_SUM_HDR FAR *)&Avio)) < 0)
            return DispAvioErr("AvioFileClose");

        bAvioFileExists = FALSE;
    }
    return TRUE;
}
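
CaptureAvioData() steps through each host buffer one frame at a time: every frame is a header followed by StrmSize[0] bytes of data, so the current pointer advances by the header size plus the payload size while BufDataCnt shrinks by the same amount. The portable C sketch below shows that walk on a single buffer. SketchFrmHdr is a hypothetical, simplified header (it is not the real AVIO_FRM_HDR layout), and PutFrame() and CountFrames() are illustrative helpers, not AVKIO calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified frame header; not the real AVIO_FRM_HDR. */
typedef struct {
    uint32_t FrmNum;        /* frame sequence number */
    uint32_t StrmSize[1];   /* payload bytes that follow the header */
} SketchFrmHdr;

/* Append a header plus 'size' payload bytes at buf[*used].
   Returns 1 on success, 0 if the frame does not fit. */
int PutFrame(unsigned char *buf, size_t cap, size_t *used,
             uint32_t frmNum, uint32_t size)
{
    SketchFrmHdr hdr;

    assert(*used <= cap);
    if (cap - *used < sizeof hdr || cap - *used - sizeof hdr < size)
        return 0;
    hdr.FrmNum = frmNum;
    hdr.StrmSize[0] = size;
    memcpy(buf + *used, &hdr, sizeof hdr);          /* header */
    memset(buf + *used + sizeof hdr, 0xAA, size);   /* dummy payload */
    *used += sizeof hdr + size;
    return 1;
}

/* Walk the buffer in the style of CaptureAvioData(): read a header, then
   skip header + payload. Returns the number of whole frames found. */
size_t CountFrames(const unsigned char *buf, size_t dataCnt,
                   size_t *payloadBytes)
{
    const unsigned char *cur = buf;
    size_t frames = 0;

    *payloadBytes = 0;
    while (dataCnt >= sizeof(SketchFrmHdr))
    {
        SketchFrmHdr hdr;
        size_t frmSize;

        memcpy(&hdr, cur, sizeof hdr);  /* avoids unaligned access */
        frmSize = sizeof hdr + hdr.StrmSize[0];
        if (frmSize > dataCnt)
            break;                      /* truncated frame: stop */
        *payloadBytes += hdr.StrmSize[0];
        cur += frmSize;
        dataCnt -= frmSize;
        frames++;
    }
    return frames;
}
```

Unlike the listing, the sketch copies each header out with memcpy() rather than casting the buffer pointer, and it stops at a truncated trailing frame. The listing appears to assume that each read returns only whole frames.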

James L. Green - James is a senior software engineer at Intel's multimedia and supercomputing components group in Princeton, New Jersey. He is one of the principal architects of the audio video kernel and is a member of the Interactive Multimedia Association's technical working group on multimedia software architectures. You can reach him through the DDJ offices.

Content created and/or collected by:
Louis F. Ohland, Peter H. Wendt, David L. Beem, William R. Walsh, Tatsuo Sunagawa, Tomáš Slavotínek, Jim Shorney, Tim N. Clarke, Kevin Bowling, and many others.

Ardent Tool of Capitalism is maintained by Tomáš Slavotínek.
Last update: 08 May 2024