IBM M-Audio Capture and Playback Adapter

Audio Application Programming Interface
Functional Description

Version 2.00
"C" Language
March 9th, 1992

(C) Copyright IBM Corporation. 1991


SUMMARY OF AMENDMENTS
_____________________

VERSION 2.00 - APRIL 2, 1991
____________________________

o  Pulse Code Modulation (PCM) support
   -  8 or 16 bit sample size
   -  8000, 11025, 22050, or 44100 Hz sample rates
   -  Mono or stereo
   -  Support of Microsoft RIFF WAVE file format

o  Source Mix support
   Allows the mixing of an analog signal from any of the M-ACPA input
   sources with the output of playing a PCM file.

o  Save/Restore support
   Allows the caller to save the current state of a play or record
   operation, stop the operation, and then restore the saved state and
   restart the operation.

o  Added a function to return the size of audio data structures

o  Removed support for ADPCM mix function

Formatting changes are not identified. Similarly, minor clarifications,
grammatical changes, spelling corrections, etc. are not identified.

Revisions for version 2.00 are marked with a vertical bar.

CHANGE HISTORY
______________

o  Version 1.02 - 11 January 90
   Initial release in Audio Visual Connection product.

o  Version 1.03 - July 2, 1990
   -  Added support for MIDI audio type
   -  Added Pause/Resume controls
   -  Added support for High Quality Music (ADPCM 22K MONO)

o  Version 2.00 - 2 April 91
   Initial release with M-ACPA. Current version


CONTENTS
________

1. Introduction
2. Overview
   2.1 High Level Example
3. AAPI Functions
   3.1 Audio File Functions
      3.1.1 FAB_TYPE - Determine type of audio file
      3.1.2 FAB_OPEN - Open an AVC audio file
      3.1.3 FAB_ROPN - Open a RIFF WAVE audio file
      3.1.4 FAB_SAVE - Save changes in an audio file
      3.1.5 FAB_CLOSE - Close an audio file
   3.2 Audio Control Functions
      3.2.1 AUD_SIZE - Return size of audio structures
      3.2.2 AUD_CFIG - Get audio device configuration
      3.2.3 AUD_INIT - Initialize audio processing
      3.2.4 AUD_SET - Set up an audio operation
      3.2.5 AUD_STRT - Start an audio operation
      3.2.6 AUD_CTRL - Control an audio operation
      3.2.7 AUD_TERM - Terminate audio processing
4. Control Data Structures
   4.1 Audio Device Control Block (ADCB)
   4.2 Audio Start/Stop List Structure (ALST)
   4.3 Audio Control Block (ACB)
   4.4 Audio Format Structure (AFMT)
5. Usage Examples
   5.1 Check audio hardware and software configuration
   5.2 Play (single channel)
   5.3 Record/Monitor
   5.4 Play (two channel)
6. AVC Audio File Format
   6.1 Audio File Overview
      6.1.1 Audio File Types
         6.1.1.1 "AUDIO" File
         6.1.1.2 "ESCAPE" File
      6.1.2 Audio Object Types
   6.2 File Data Structures
      6.2.1 Directory Control Block
      6.2.2 File Access Block (FAB)
      6.2.3 Audio Objects
         6.2.3.1 Object Descriptions
7. RIFF WAVE Audio File Format
   7.1 Audio File Overview
Appendix A. OS/2 Considerations
   A.1 OS/2 Usage
   A.2 OS/2 Device Driver
Appendix B. DOS Considerations
   B.1 DOS Usage
   B.2 DOS Device Driver
   B.3 DOS Reentrancy
   B.4 Expanded Memory
Appendix C. Additional Information on PCM Support
   C.1 Significance of the Different PCM Modes
      C.1.1 Sample Rate
      C.1.2 Sample Width
   C.2 Mu-Law and A-Law Companding
   C.3 Dither
   C.4 Volume, Balance, Ramp and Pan
      C.4.1 Master Volume
      C.4.2 Track Volume
      C.4.3 Ramp Rate
      C.4.4 Balance
      C.4.5 Pan Rate
   C.5 Source Mixing


1. INTRODUCTION
_______________

The Audio Application Programming Interface (AAPI) is a set of high level
"C" functions allowing easy application program access to the audio
hardware function and the stored digitized audio. These functions are
intended to be at a level appropriate to shield the application programs
from the hardware and the various digitized audio formats, but not so high
a level as to implement functions unique to various applications.

There are two function sets in the AAPI. The first set allows access to
audio files, and the data within the files. The second set of functions is
used to control audio operations such as play and record.

There are two types of audio file formats supported by the AAPI. The Audio
Visual Connection (AVC) file format is used to play and record compressed
audio. These files contain audio data that was compressed using an
Adaptive Differential Pulse Code Modulation (ADPCM) technique. The
Microsoft Resource Interchange File Format Waveform Audio File Format
(RIFF WAVE) is used to play and record uncompressed audio. These files
contain Pulse Code Modulation (PCM) data.

The suggested way to quickly understand the AAPI and get started writing
an application is to browse this document and then study the example
programs (play.c, 2play.c, and record.c) provided with the AAPI package.
After studying the example programs, use this document as a reference for
more detailed information.


2. OVERVIEW
___________

The Audio Application Programming Interface (AAPI) is a set of high level
"C" functions intended to allow applications to add audio capabilities
such as playing and recording with a minimum of effort.

The AAPI is file oriented. The application is not required to understand
the audio file formats to use the AAPI. The AAPI has functions to open,
save, and close audio files.
Once the file is open, the application does not need to read or write the
audio data itself. It only passes the information about the file that was
returned on the open to subsequent audio functions.

The AAPI supports two types of audio files, AVC and RIFF WAVE. The AVC
audio file actually consists of two physical files. One of the files
contains the actual digitized data and is called an "escape" file. The
second file contains general information about the audio and
sub-components called "objects". Together they are referred to as an AVC
audio file. See 6, "AVC Audio File Format" for a complete description of
the AVC file format.

The Microsoft RIFF WAVE file consists of only one physical file. The
digitized data and sub-components (called chunks) are contained in the
same file. See 7, "RIFF WAVE Audio File Format" for more information on
the RIFF WAVE format.

Although these two file formats are quite different, once the AAPI is used
to open either type of file they are represented to the application using
the same in-memory structure. This structure, called a File Access Block
(FAB), is filled out and returned when the file is opened. The application
does not need to understand the FAB information; it just passes a pointer
to the FAB to subsequent AAPI function calls. The FAB is described in
detail in 6.2.2, "File Access Block (FAB)".

Once the audio file is open and ready for use, the application is ready to
use the AAPI control functions. In general these functions take as input a
structure called the Audio Control Block (ACB). See 4, "Control Data
Structures" for a complete description of the ACB and all other control
structures. Once initialized, the ACB must be maintained between calls,
with the application changing only the defined input fields for a
particular function. The current state of the audio process is contained
in the ACB.
The application can look at the ACB at any time to get status information
once an audio operation such as play or record is in progress.

2.1 HIGH LEVEL EXAMPLE
______________________

Below is a high level example of using the AAPI function calls. See 5,
"Usage Examples" and the actual "C" example programs shipped with the AAPI
for more detailed examples.

   Determine type of audio file (FAB_TYPE)
   If AVC file
      Open the file (FAB_OPEN)
   Else RIFF WAVE file
      Open the file (FAB_ROPN)
   Endif
   Initialize the Audio Control Block (AUD_INIT)
   Determine audio configuration (AUD_CFIG)
   Set up for playing or recording (AUD_SET)
   Start playing or recording (AUD_STRT)
   While play or record operation in progress
      Application specific tasks
      Check Audio Control Block for status and position
      Control volume or balance (AUD_CTRL)
   End While
   Stop playing or recording (AUD_CTRL)
   Terminate the Audio Control Block (AUD_TERM)
   Save the file if recording has added data (FAB_SAVE)
   Close the audio file (FAB_CLOSE)


3. AAPI FUNCTIONS
_________________

The Audio Application Programming Interface functions are delivered for
DOS as a "C" large model static link library and for OS/2 as a dynamic
link library (DLL). See Appendix A, "OS/2 Considerations" and Appendix B,
"DOS Considerations" for additional information on using the AAPI under
specific operating systems.

3.1 AUDIO FILE FUNCTIONS
________________________

The following functions provide the interface to access audio files, and
build the necessary data structures to interface with the audio control
functions.

| 3.1.1 FAB_TYPE - DETERMINE TYPE OF AUDIO FILE
|
| Description:
| ____________
|
| Attempts to open the requested file and determines if it is a known type
| of audio file.
|
| Synopsis:
| _________
|
|    int fab_type(name, type)        /* Get type of file               */
|    char *name;                     /* Pointer to file name           */
|    unsigned int *type;             /* Pointer to type of file        */
|
| Input Parameters:
| _________________
|
| Itemized below are the input parameters that must be initialized properly
| before the fab_type function is called.
|
| NAME        - Pointer to fully qualified file name
|
| TYPE        - Pointer to field to return type of file
|               Returned types are:
|                 0 = No file exists
|                 1 = Unknown type
|                 2 = AVC
|                 4 = RIFF WAVE
|
| Output Return Values:
| _____________________
|
|    0,    Successful
|    3303, Open failed, fatal error
|
| Function:
| _________
|
| 1. Checks for existence of file.
|
| 2. If exists, opens the file and checks header for type.
|
| 3. Closes the file.

3.1.2 FAB_OPEN - OPEN AN AVC AUDIO FILE

Description:
____________

Creates and/or opens an AVC audio file and initializes/reads all objects
associated with the file including the escape file.

Synopsis:
_________

   int fab_open(name, create,        /* Open audio file                */
                expand, object_flag, fabpp,
                aud_sz, vol_sz,
                pnt_sz, lab_sz)
   char *name;                       /* File name                      */
   unsigned char create;             /* Create or open file            */
   unsigned char expand;             /* Expand objects in memory       */
   unsigned char object_flag;        /* Specify objects to process     */
   struct fab **fabpp;               /* Pointer to FAB pointer         */
   unsigned int aud_sz;              /* Size to allocate audio object  */
   unsigned int vol_sz;              /* Size to allocate volume object */
   unsigned int pnt_sz;              /* Size to allocate points object */
   unsigned int lab_sz;              /* Size to allocate label object  */

Input Parameters:
_________________

Itemized below are the input parameters that must be initialized properly
before the fab_open function is called.

NAME        - A null terminated string that contains the fully
              qualified name of the file to access.

CREATE      - Operation to perform on file
              01h = Create then open - Return error if file exists
              02h = Open - Return error if file does not exist
              03h = Any - Open the file, if it does not exist
                    then create and open

EXPAND      - Size of objects to allocate
              00h = Use existing size when allocating objects
              01h = Allocate objects at maximum in memory size
                    regardless of current size
              02h = Allocate using values passed by caller or
                    at current size whichever is greater.
                    (AUD_SZ, VOL_SZ, PNT_SZ, LAB_SZ)
                    Sizes for objects not being processed will be
                    ignored.

OBJECT_FLAG - "Audio" and "Escape" objects will always be processed.
              Additional objects will be processed based on this
              flag: (Set bits individually)
              01h = Process "Volume" object
              02h = Process "Points" object
              04h = Process "Labels" object

FABPP       - Pointer to area to store allocated and initialized
              FAB pointer

AUD_SZ      - Size in bytes to allocate for the audio object.

VOL_SZ      - Size in bytes to allocate for the volume object.

PNT_SZ      - Size in bytes to allocate for the points object.

LAB_SZ      - Size in bytes to allocate for the labels object.

Output Return Values:
_____________________

   0,    Successful
   3301, Open failed, file already exists
   3302, Open failed, file doesn't exist
   3303, Audio file open error
   3304, Allocation of FAB failed, insufficient storage
   3305, Allocation of object failed, insufficient storage
   3306, Audio file read error
   3307, Escape file open error

Function:
_________

1. Allocates and initializes the FAB and FABOs.

2. Creates/opens the audio file as requested.

3. Allocates the object header and data sections.

4. Reads the objects in from the audio file.

5. Creates/opens the escape file.

Notes:
______

o  Any audio operation that will expand the "audio" object, such as
   record or monitor, requires the "volume" object. The "points" and
   "labels" objects are always optional.

o  The audio file name and its extension are used to build the escape
   file name.
   If the audio file has a three character extension, the first
   character of that extension is kept and followed by "ad" to form the
   extension of the escape file. If the audio file has no extension, or
   an extension of fewer than three characters, only ".ad" is used. For
   example,

      AUDIO FILE NAME      ESCAPE FILE NAME
      name.!au             name.!ad
      name._au             name._ad
      name.xau             name.xad
      name                 name.ad
      name.xx              name.ad

| 3.1.3 FAB_ROPN - OPEN A RIFF WAVE AUDIO FILE
|
| Description:
| ____________
|
| Creates and/or opens a RIFF WAVE audio file.
|
| Synopsis:
| _________
|
|    int fab_ropn(name, create,      /* Open audio file                */
|                 fabpp, fmtp)
|    char *name;                     /* File name                      */
|    unsigned char create;           /* Create or open file            */
|    struct fab **fabpp;             /* Pointer to FAB pointer         */
|    struct afmt *fmtp;              /* Pointer to PCM format data     */
|
| Input Parameters:
| _________________
|
| Itemized below are the input parameters that must be initialized properly
| before the fab_ropn function is called.
|
| NAME        - A null terminated string that contains the fully
|               qualified name of the file to access.
|
| CREATE      - Operation to perform on file
|               01h = Create then open - Return error if file exists
|               02h = Open - Return error if file does not exist
|               03h = Any - Open the file, if it does not exist
|                     then create and open
|
| FABPP       - Pointer to area to store allocated and initialized
|               FAB pointer
|
| FMTP        - Pointer to area to store PCM format data.
|               If opening an existing file then the AFMT
|               structure will be filled in by the FAB_ROPN
|               call. If creating a new file then the caller
|               must initialize the structure, selecting what
|               type of PCM (sampling rate, etc.) the file
|               should have. In both cases the AFMT structure
|               will then be passed as input to other AAPI calls.
|
| Output Return Values:
| _____________________
|
|    0,    Successful
|    3301, Open failed, file already exists
|    3302, Open failed, file doesn't exist
|    3304, Allocation of FAB failed, insufficient storage
|    3307, File open error
|
| Function:
| _________
|
| 1. Allocates and initializes the FAB and FABOs.
|
| 2. Creates/opens the audio file as requested.
|
| Notes:
| ______
|
| o  Source Mixing
|
|    An additional function is available when playing PCM files. This
|    function allows any live input source from the ACPA card, such as
|    microphone or line input, to be played and controlled just like
|    playing any PCM file. To access this function, call FAB_ROPN with
|    NAME = "MSRCMIX$" and CREATE = 2. Once this device has been opened,
|    use it just like any other PCM file.

3.1.4 FAB_SAVE - SAVE CHANGES IN AN AUDIO FILE

Description:
____________

Locates all the objects in an audio file and saves the ones that have been
marked as changed or that have been requested to be saved by the caller.
If the file has never been saved, all objects are saved.

Synopsis:
_________

   int fab_save(fabp, oupdates)      /* Save changes to an audio file  */
   struct fab *fabp;                 /* Pointer to FAB                 */
   unsigned int oupdates;            /* Specify objects to update      */

Input Parameters:
_________________

Itemized below are the input parameters that must be initialized properly
before the fab_save function is called.

FABP        - Pointer to FAB

OUPDATES    - Objects to update (Set bits individually)
              01h = Save audio object
              02h = Save volume object
              04h = Save escape object
              08h = Save points object
              10h = Save labels object

Output Return Values:
_____________________

   0,    Successful
   3304, Insufficient storage to complete operation
   3308, Audio file write failed

Function:
_________

1. Locate each object.

2. Mark object for save if file never saved or caller requested save.

3. Write the objects marked for change to disk.

Notes:
______

o  The Audio Control Block field "oupdates" may be passed directly to
   this function (as the "oupdates" parameter) after an audio control
   operation. This will write any objects that were changed during the
   audio control operation (such as record) to the audio file.
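
To illustrate how the OUPDATES bits combine, the sketch below names each
bit and reports which objects a given mask selects. Only the bit values
come from the OUPDATES table above. The macro names (OUPD_AUDIO and so on)
and the helper oupd_describe are hypothetical and are not part of the
AAPI, so treat this as a sketch rather than the library's own definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bit values of the fab_save "oupdates" parameter (from the table above).
   The macro names are hypothetical; the AAPI headers may differ. */
#define OUPD_AUDIO   0x01u
#define OUPD_VOLUME  0x02u
#define OUPD_ESCAPE  0x04u
#define OUPD_POINTS  0x08u
#define OUPD_LABELS  0x10u

/* Writes a comma-separated list of the objects selected by 'mask' into
   'out' and returns how many objects are selected. Names that do not fit
   in 'out' are left out of the text but are still counted. */
int oupd_describe(unsigned int mask, char *out, size_t outlen)
{
    static const struct { unsigned int bit; const char *name; } objs[] = {
        { OUPD_AUDIO,  "audio"  },
        { OUPD_VOLUME, "volume" },
        { OUPD_ESCAPE, "escape" },
        { OUPD_POINTS, "points" },
        { OUPD_LABELS, "labels" }
    };
    size_t i;
    int count = 0;

    assert(out != NULL && outlen > 0);
    out[0] = '\0';

    for (i = 0; i < sizeof objs / sizeof objs[0]; i++) {
        if ((mask & objs[i].bit) == 0)
            continue;
        /* Room needed: optional comma, the name, and the terminator. */
        if (strlen(out) + strlen(objs[i].name) + 2 < outlen) {
            if (out[0] != '\0')
                strcat(out, ",");
            strcat(out, objs[i].name);
        }
        count++;
    }
    return count;
}
```

With a mask of 01h | 02h | 04h the helper reports the audio, volume, and
escape objects. A caller could use the same mask arithmetic to add bits to
the value taken from the ACB before passing it to fab_save.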

3.1.5 FAB_CLOSE - CLOSE AN AUDIO FILE

Description:
____________

Closes the audio file and de-allocates all storage associated with the
audio file.

Synopsis:
_________

   int fab_close(fabp)               /* Close an audio file            */
   struct fab *fabp;                 /* Pointer to FAB                 */

Input Parameters:
_________________

Itemized below are the input parameters that must be initialized properly
before the fab_close function is called.

FABP        - Pointer to FAB

Output Return Values:
_____________________

   0,    Successful

Function:
_________

1. Close the escape file (AVC file type).

2. Close the audio file.

3. De-allocate objects.

4. De-allocate the FAB and FABOs.


3.2 AUDIO CONTROL FUNCTIONS
___________________________

The following functions provide the interface to access and control the
audio hardware/digital signal processing code and its supported
operations. These functions interface with operating system files using
the File Access Block (FAB) structure. This interface to files can be set
up using the audio file functions provided by this program, or the caller
can perform its own setup and use only these audio control functions.

| 3.2.1 AUD_SIZE - RETURN SIZE OF AUDIO STRUCTURES
|
| Description:
| ____________
|
| Returns the length in bytes of the requested structure.
|
| Synopsis:
| _________
|
|    int aud_size(structure_type)    /* Return size of structure       */
|    int structure_type;             /* Type of structure              */
|
| Input Parameters:
| _________________
|
| Itemized below are the parameters that must be initialized properly before
| the aud_size function is called.
|
| structure_type - Type of audio structure to size
|                  01h - Audio Device Structure
|                  02h - Audio Control Block
|                  03h - Save/Restore Area
|                  04h - Start/Stop List
|
| Output Return Values:
| _____________________
|
|    N,    Size in bytes of structure
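
One way a caller might use AUD_SIZE is to size its allocations at run time
instead of relying on compiled-in structure lengths. The sketch below
shows that pattern under several assumptions: the enumerator names, the
helper aud_alloc_struct, and the stand-in fake_aud_size (with made-up
sizes) are all hypothetical, and treating a non-positive result as a
failure is a guess, since the return value for an unknown type is not
specified here. The size function is passed as a pointer so the sample
compiles without the AAPI library; a real program would pass aud_size.

```c
#include <assert.h>
#include <stdlib.h>

/* Structure type codes accepted by aud_size (from the table above).
   The enumerator names are hypothetical. */
enum aud_struct_type {
    AUD_TYPE_ADCB = 0x01,   /* Audio Device Structure */
    AUD_TYPE_ACB  = 0x02,   /* Audio Control Block    */
    AUD_TYPE_SVRS = 0x03,   /* Save/Restore Area      */
    AUD_TYPE_ALST = 0x04    /* Start/Stop List        */
};

/* Same shape as aud_size in the synopsis: type code in, byte count out. */
typedef int (*aud_size_fn)(int structure_type);

/* Allocates a zero-filled block sized by 'size_of' for the given type.
   Returns NULL if the reported size is not positive (an assumption of
   this sketch) or if the allocation fails. Release with free(). */
void *aud_alloc_struct(aud_size_fn size_of, int structure_type)
{
    int n;

    assert(size_of != NULL);
    n = size_of(structure_type);
    if (n <= 0)
        return NULL;
    return calloc(1, (size_t)n);
}

/* Stand-in for aud_size so the sketch is self-contained. The sizes are
   invented for illustration and are NOT the real structure sizes. */
int fake_aud_size(int structure_type)
{
    switch (structure_type) {
    case AUD_TYPE_ADCB: return 64;
    case AUD_TYPE_ACB:  return 256;
    case AUD_TYPE_SVRS: return 128;
    case AUD_TYPE_ALST: return 32;
    default:            return 0;
    }
}
```

Sizing structures this way keeps an application independent of the exact
layout used by a given AAPI level, which is presumably why the function
was added alongside the Save/Restore area.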

3.2.2 AUD_CFIG - GET AUDIO DEVICE CONFIGURATION

Description:
____________

Access and return configuration information and status on the audio device
and any corresponding device driver.

Synopsis:
_________

   int aud_cfig(adcbptr)             /* Get configuration information  */
   struct adcb *adcbptr;             /* Pointer to Audio Device CB     */

Input Parameters:
_________________

Input ADCB:
___________

No input fields.

Output Return Values:
_____________________

   0,    Successful, audio device installed and active
   3224, Failed, no audio device installed
   3225, Failed, audio device interrupts are disabled
   3226, Failed, device driver not responding

Output ADCB:
____________

Itemized below are the ADCB fields set by the aud_cfig function.

DEVICE_ID   - Microchannel device ID
              6E6Ch = IBM M-Audio Capture and Playback Adapter

IOBASE      - Audio card base address for program I/O

INTLEVEL    - Audio card hardware interrupt level

OS2RTN      - OS/2 error code

Function:
_________

1. If operating system is OS/2, verifies the device driver is installed
   and active. If an error occurs accessing the device driver, returns
   (3226) and also returns the OS/2 error code in the ADCB.

2. Verifies that an audio device is installed, returns (3224) otherwise.

3. Verifies that interrupts are not disabled, returns (3225) otherwise.

4. If an audio device is installed and active, sets configuration
   information in the ADCB and returns (0).

3.2.3 AUD_INIT - INITIALIZE AUDIO PROCESSING

Description:
____________

Initialize a caller passed Audio Control Block for subsequent AAPI calls.

Synopsis:
_________

   int aud_init(acbptr)              /* Initialize audio processing    */
   struct acb *acbptr;               /* Pointer to Audio Control Block */

Input Parameters:
_________________

Input ACB:
__________

Itemized below are the ACB fields that must be initialized properly before
the aud_init function is called.

ACB2PTR     - ACB pointer of second track to process
              00000000h = Single track

All other ACB fields will be initialized by this function. The following
are the default values for the audio controls if the user does not
explicitly change them:

o  I/O buffer length if AAPI allocates storage - 32K bytes
o  Master volume level - 100%
o  Track volume level - 100%
o  Channel balance - 50%
o  Output channel pair - A(100%)/B(0%)
o  Audio compression type - AVC ADPCM/11K

The ACB must be maintained between calls (aud_init through aud_term) with
the user changing only defined input fields for a particular function
call.

Output Return Values:
_____________________

   0,    Successful

Output ACB:
___________

Itemized below are the ACB fields set by the aud_init function. All output
fields marked with "*" are updated at one tenth of a second intervals
requiring no function call once an asynchronous (play, record) audio
operation has been started.

* POSITION  - Current position in audio (milliseconds)

* STATE     - Current state of signal processor
              00h = stopped
              01h = playing
              02h = recording
              03h = stopping

* TIMELEFT  - Time left before I/O necessary (ms)

  OUPDATED  - Indication of which objects were changed by AAPI

  ACB2RC    - Return code for second ACB operation

Function:
_________

1. All input and output fields (except ACB2PTR) will be initialized.

2. If a secondary ACB is passed in ACB2PTR that ACB will also be
   initialized.

Notes:
______

o  This function will always return successful.

3.2.4 AUD_SET - SET UP AN AUDIO OPERATION

Description:
____________

Set up the signal processor and audio objects so that subsequent calls to
start and stop audio operations happen immediately.

Synopsis:
_________

   int aud_set(acbptr)               /* Set up audio operations        */
   struct acb *acbptr;               /* Pointer to Audio Control Block */

Input Parameters:
_________________

Input ACB:
__________

Itemized below are the ACB fields that must be initialized properly before
the aud_set function is called. Only the areas listed should be altered.
All other initialization of fields is done on an aud_init call, which is
required before any other type of AAPI control function call is made.

The ACB must be maintained between calls (aud_init through aud_term) with
the user changing only defined input fields for a particular function
call.

CHANNEL     - Signal processor channel identifier
              00h = Channel A
              01h = Channel B
|             FFh = An available channel will be selected and
|                   returned in this field.

DSPMODE     - Set up signal processor for this audio operation
              01h = Play
              02h = Record
              03h = Monitor

SEEKTYPE    - Set up audio object(s) at this position
              00h = Initialize for new/different audio object
              01h = Seek to new position in current audio object
              02h = Continue at next position in current object
              03h = Release resources only
              04h = Release resources, then initialize new object

AUDTYPE     - The type of audio compression to use.
              00h = Use current for existing or default for new
              01h = Use ADPCM 11K (AVC mono music)
              02h = Use ADPCM 5.5K (AVC mono voice)
              03h = Use ADPCM 22K (AVC stereo music)
              04h = Use ADPCM 22K (AVC mono high quality music)
|             3Ch = Use PCM (RIFF WAVE file)
              Ignored if "DSPMODE" is not record or monitor.
|             ADPCM (types 0-4) can only be used with AVC format
|             audio files. PCM (type 3Ch) can only be used with
|             RIFF WAVE files.

INTLEVEL    - Audio card hardware interrupt level

INPSRCE     - Audio card input source
              00h = Microphone input - normal gain
              01h = Line level input - left channel
              02h = Line level input - right channel
              03h = Line level input - both channels
              04h = Microphone input - low gain
              Ignored if "DSPMODE" is not record or monitor
|             unless doing "source mix". Valid options for
|             source mix are 0, 3, and 4.

PIOBASE     - Audio card base address for port I/O

CONTROLS    - Control changes requested (Set bits individually)
              01h = Master volume
              02h = Track volume - 1
              04h = Track volume - 2
              08h = Channel balance
              10h = Set stop position
              40h = Pause current operation
              80h = Resume current operation
              Settings 40h and 80h are mutually exclusive.
              All or none of these controls may be requested.

OPRPARMS    - Operation parameters flag (Set bits individually)
              02h  = Do not update audio/volume object totals
              04h  = PS/2 speaker enabled, ignored on non-PS/2
              10h  = Monitor - record pre-compression
              20h  = Monitor - record post-compression
              200h = Cancel operation if switched to background
              400h = Setup VOICE/MIDI DSP environment
              Settings 10h and 20h are mutually exclusive
              and ignored if DSPMODE is not monitor.

FILEPTR     - File Access Block pointer

SUBTYPE     - FABO subtype of objects

AUDSTART    - Start position for seek (milliseconds)
              00000000h = Start of data

AUDEND      - End position for seek (milliseconds)
              00000000h = End of data

BUFFLEN     - I/O buffer length

BUFFPTR     - I/O buffer pointer

EMMHAN      - Expanded memory manager handle
              Ignored if "MEMID" is not set to "LIM".

EMMCNT      - Expanded memory manager page count
              Ignored if "MEMID" is not set to "LIM".

MEMID       - Memory type id
              00h = not allocated
              01h = main memory
              02h = LIM memory
              03h = Device
              If memory is not allocated by the caller, the AAPI
              will attempt to allocate the size specified in
              BUFFLEN from main memory. If BUFFLEN is zero a
              default length of 32K bytes will be allocated.
              If the AAPI allocates the memory it will be released
              when the caller does an aud_term.

CTLPARMS    - Control parameters flag (Set bits individually)
              01h = Stop - Ignore other control requests (on)
                    Stop - Honor pending control requests (off)
                    Setting "01h" is ignored if CONTROLS is not
                    set for "set stop position".
              02h = Purge control queue for this channel
              08h = Pause/resume all channels (on)
                    Pause/resume only this channel (off)
                    Setting "08h" is ignored if CONTROLS is not
                    set for "pause" or "resume".
|             10h = Restore the previous control state. On
|                   subsequent calls to "aud_set" this flag can
|                   be set to restore the control settings in
|                   effect when the last operation was stopped.
|                   If the caller passed a pointer to a save area
|                   (SVRSPTR) on the initial call to "aud_set" and
|                   is resuming at the same place in
|                   the same file (SEEKTYPE=2), then the
|                   current state and any pending controls
|                   that were queued but not yet executed will
|                   be re-queued (including stop requests). If no
|                   save area was passed or not restarting where
|                   the last operation was stopped then only the
|                   last control settings will be restored.
|
| MASVOL      - Master volume level (ADPCM 0 - 100/PCM 0 - 120)
              Required if "CONTROLS" is set for master volume.

TRKVOL1     - Track volume level (0 - 100)
              Required if "CONTROLS" is set for track volume-1.

TRKVOL1S    - Start position for track volume change (ms)
              Required if "CONTROLS" is set for track volume-1.

TRKVOL1E    - End position for track volume change (ms)
              Required if "CONTROLS" is set for track volume-1.

TRKVOL2     - Track volume level (0 - 100)
              This field applies to exactly the same control as
              TRKVOL1. This second field gives the caller the
              ability to specify two volume fades at different
              positions in the audio. If the positions overlap
              the caller will be overriding the previous control
              change that was requested.
              Required if "CONTROLS" is set for track volume-2.

TRKVOL2S    - Start position for track volume change (ms)
              Required if "CONTROLS" is set for track volume-2.

TRKVOL2E    - End position for track volume change (ms)
              Required if "CONTROLS" is set for track volume-2.

CHNBAL      - Channel balance (pan) level (100 - 0)
              Required if "CONTROLS" is set for channel balance.

OUTCHAN     - Output channel pair for balance
              0 = Channel left/right
              1 = Channel right/left
              Required if "CONTROLS" is set for channel balance.

CHNBALS     - Start position for channel balance change (ms)
              Required if "CONTROLS" is set for channel balance.

CHNBALE     - End position for channel balance change (ms)
              Required if "CONTROLS" is set for channel balance.

STOPPOS     - Position to stop current operation (ms)
              Required if "CONTROLS" is set for set stop position.

ACB2PTR     - ACB pointer of second track to process
              00000000h = Single track

LISTPTR     - Pointer to an Audio Start/Stop List structure.
              This structure contains additional start/stop
              positions to be used during play operations.
              This structure also contains a pointer to the
              next List structure. This linked list of positions
              is not used until the position passed in "AUDEND"
              has been reached.
              00000000h = No additional positions

DSP_PATH    - Pointer to a fully qualified directory name to use
              when accessing DSP files. If this field is zero then
              the current directory is used.

| SVRSPTR     - Pointer to a Save/Restore area to be used to restart
|               play and record. See the CTLPARMS description for
|               more information.
|
| FMTPTR      - Pointer to an audio format area that contains
|               additional information about the type of audio
|               being played or recorded. Required only if
|               using a RIFF WAVE file. Set to 0 for AVC files.
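
Several of the input fields above carry constraints that a caller can
check before calling aud_set: CONTROLS bits 40h and 80h are mutually
exclusive, OPRPARMS bits 10h and 20h are mutually exclusive, and MASVOL
has a different upper limit for ADPCM and PCM. The sketch below shows one
way to express those checks. The macro and function names are hypothetical
and are not part of the AAPI, and rejecting undocumented CONTROLS bits is
an assumption of the sketch rather than documented AAPI behavior.

```c
#include <assert.h>
#include <stddef.h>

/* CONTROLS bits from the input list above. Names are hypothetical. */
#define CTL_MASTER_VOL  0x01u
#define CTL_TRACK_VOL1  0x02u
#define CTL_TRACK_VOL2  0x04u
#define CTL_BALANCE     0x08u
#define CTL_STOP_POS    0x10u
#define CTL_PAUSE       0x40u
#define CTL_RESUME      0x80u

/* OPRPARMS monitor bits from the input list above. Names hypothetical. */
#define OPR_MON_PRE     0x10u
#define OPR_MON_POST    0x20u

/* Returns 1 if 'controls' uses only documented bits and does not request
   pause and resume together, 0 otherwise. An empty mask is valid. */
int controls_valid(unsigned int controls)
{
    const unsigned int documented =
        CTL_MASTER_VOL | CTL_TRACK_VOL1 | CTL_TRACK_VOL2 |
        CTL_BALANCE | CTL_STOP_POS | CTL_PAUSE | CTL_RESUME;

    if ((controls & ~documented) != 0)
        return 0;                       /* assumption: reject e.g. 20h */
    if ((controls & CTL_PAUSE) != 0 && (controls & CTL_RESUME) != 0)
        return 0;                       /* 40h and 80h are exclusive   */
    return 1;
}

/* Requests pause (nonzero) or resume (zero), clearing the opposite bit
   so the two are never set together. Other bits are left unchanged. */
void request_pause(unsigned int *controls, int pause)
{
    assert(controls != NULL);
    *controls &= ~(CTL_PAUSE | CTL_RESUME);
    *controls |= pause ? CTL_PAUSE : CTL_RESUME;
}

/* Returns 1 unless both monitor record modes are requested at once. */
int monitor_flags_valid(unsigned int oprparms)
{
    return !((oprparms & OPR_MON_PRE) != 0 &&
             (oprparms & OPR_MON_POST) != 0);
}

/* Returns 1 if 'level' lies in the documented MASVOL range:
   0 - 100 for ADPCM, 0 - 120 for PCM. */
int masvol_in_range(int level, int is_pcm)
{
    return level >= 0 && level <= (is_pcm ? 120 : 100);
}
```

Checking these fields in the application gives a clearer diagnostic than
waiting for a return code from aud_set, which does not list a dedicated
error for conflicting control bits.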
AAPI Functions 22 Output Return Values: _____________________ 0, Successful 3201, "Audio" object read error 3202, "Audio" object not found 3203, "Volume" object read error 3204, "Volume" object not found 3205, "Escape" object read error 3206, "Escape" object not found 3207, "Escape" file read error 3208, I/O buffer allocation failed 3209, "audstart" position invalid 3210, "audend" position invalid 3211, "Escape" file seek error 3212, DSP card not responding 3213, DSP buffer allocation failed 3214, DSP program not found 3215, DSP program not readable 3216, "dspmode" invalid or audio file/microcode versions don't match 3218, Unable to install audio interrupt service routine 3219, "Escape" file write error 3220, Control queue overflow, control ignored 3221, EMS Save Page Map failed 3222, EMS Map Page failed 3223, EMS Restore Page Map failed 3224, Microchannel system, but no audio device installed 3226, Device driver not responding 3233, DSP running and requested operation requires reload of DSP 3234, MIDI audio type can not be played on Channel A or in DOS 3235, Device driver error occurred when writing MIDI data 3236, Operating system call failed, system inconsistency Output ACB: ___________ Itemized below are the ACB fields set by the aud_set function. All output fields marked with "*" are updated at one tenth of a second intervals requiring no function call once an operation (play, record) has been started. * POSITION - Current position in audio (ms) * STATE - Current state of signal processor 00h = stopped 01h = playing 02h = recording 03h = stopping * TIMELEFT - Time left before I/O necessary (ms) OUPDATED - Indication of which objects were changed by AAPI ACB2RC - Return code for second ACB operation 3. AAPI Functions 23 Function: _________ 1. Based on the requested DSPMODE, if the signal processor is not already in the required mode, the current mode is terminated and the new mode is initialized. 2. 
     Based on SEEKTYPE, if the current audio object(s) are no longer
     needed, the current objects are purged (see notes for more detail)
     and the requested objects are read as necessary.

  3. An open (if not already open) and seek will be done in the escape
     file to the caller's requested position. Using the caller's
     "BUFFPTR" and "BUFFLEN" the maximum amount of audio data will be
     pre-loaded if a play operation is being done.

  Notes:
  ______

| o The maximum record time is 10 minutes for voice, music, and stereo
    music. There is a 5 minute limit for HQ music.

| o The size of the Audio I/O buffer(s) in main memory can be greater than
|   64K. If BUFFLEN is set for greater than 64K then it must be on 64K
|   boundaries (64K, 128K, 192K...) and the caller must allocate the
|   buffer and put the pointer to it in BUFFPTR.

  o LIM memory can be passed for the Audio I/O buffer(s). The AAPI will
    always save the current LIM mapping, remap as required for the Audio
    I/O buffer, and then restore the original mapping before exit. No
    remapping for LIM is done for other data items such as FABOs. If
    other data used by the AAPI has been allocated in LIM, it must have
    the same mapping as the Audio I/O buffer, or be mapped correctly
    upon entry to the AAPI and remain mapped correctly until the AAPI is
    terminated.

  o An error on any ACB will terminate the processing and the AAPI will
    return to the caller without processing any additional ACBs.

| o During ADPCM recording the volume and balance controls are not used.
|   The master volume controls the monitor level of the recording but this
|   does not affect the recorded data.
|
| o During PCM recording the master volume, volume, and balance controls
|   are in effect and control the monitor volume and balance as well as
|   the recorded data.

  o Due to DSP code size and performance restrictions on the M-ACPA
    signal processor and the host PC not all functions can be loaded and
    performed simultaneously.
    The restrictions are:

    - Recording is limited to one track at a time.

    - Playing two AVC ADPCM tracks is allowed as long as neither track
      is using stereo or high quality audio compression.

|   - AVC ADPCM and RIFF WAVE files can not be played at the same time.

|   - Playing two RIFF WAVE tracks is allowed as long as neither track
|     is stereo, mu-law, or a-law.
|
|   - 44 kHz stereo mu-law or a-law is not allowed.
|
|   - Monitoring the recorded output of an ADPCM stereo compression
|     track or any PCM track while it is being recorded is not allowed.
|     The source can be monitored during the recording but not the
|     result.
|
|   - Source mix can only be used on one PCM track at a time.

    - Playing one track and recording a different track (dubbing)
      simultaneously is not allowed.

    - MIDI audio files can only be played in OS/2, not in DOS.

    - MIDI audio files can only be played on Track B. The only audio
      type that can be played simultaneously with MIDI is ADPCM voice on
      Track A. There are two DSP environments (DSP load modules) where
      voice can be played, VOICE/MIDI and VOICE/(VOICE or MUSIC). Once
      one of the environments has been set up and is started, changing
      to a different environment can not be done without stopping the
      DSP. For example, if the VOICE/MIDI environment is loaded and
      VOICE is playing, only MIDI can be played simultaneously. If a
      MUSIC file is now to be played the DSP must be stopped and
      reloaded. If the VOICE/MUSIC environment had been loaded in the
      first place then both VOICE and MUSIC could be played without
      stopping the DSP. If MIDI or MUSIC is the first audio type to be
      requested for play, then there is no choice on which environment
      to load. If VOICE is the first request then the caller can pick
      the environment by setting or clearing the "Setup VOICE/MIDI" flag
      in the OPRPARMS field.

  3.2.5 AUD_STRT - START AN AUDIO OPERATION

  Description:
  ____________

  Start audio processing for the operation requested by the caller.
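Before a two-track play is started, the combination of tracks can be checked against the simultaneous-play restrictions listed in the notes for aud_set and aud_strt. The sketch below is one possible reading of those restrictions in C. The enum track_kind, struct track, and can_play_together are hypothetical names, and the categories are not the AUDTYPE codes; the document does not give codes for MIDI, mu-law, or a-law in this section. The check applies the listed restrictions literally and does not model which DSP environment is currently loaded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical track categories for illustration; not AUDTYPE values. */
enum track_kind {
    TK_ADPCM_VOICE,
    TK_ADPCM_MUSIC,
    TK_ADPCM_STEREO,
    TK_ADPCM_HQ,
    TK_PCM,            /* RIFF WAVE linear PCM */
    TK_PCM_MULAW,      /* RIFF WAVE mu-law */
    TK_PCM_ALAW,       /* RIFF WAVE a-law */
    TK_MIDI
};

struct track {
    enum track_kind kind;
    int stereo;        /* nonzero for a stereo RIFF WAVE track */
};

static int is_adpcm(enum track_kind k)
{
    return k == TK_ADPCM_VOICE || k == TK_ADPCM_MUSIC ||
           k == TK_ADPCM_STEREO || k == TK_ADPCM_HQ;
}

/* ADPCM types that may share the DSP with a second ADPCM track. */
static int is_shareable_adpcm(enum track_kind k)
{
    return k == TK_ADPCM_VOICE || k == TK_ADPCM_MUSIC;
}

/* Returns 1 if track a (Track A) and track b (Track B) may be played at
 * the same time under the listed restrictions, otherwise 0. */
int can_play_together(const struct track *a, const struct track *b)
{
    assert(a != NULL && b != NULL);

    if (a->kind == TK_MIDI)
        return 0;                          /* MIDI only on Track B */
    if (b->kind == TK_MIDI)
        return a->kind == TK_ADPCM_VOICE;  /* only voice beside MIDI */

    if (is_adpcm(a->kind) && is_adpcm(b->kind))
        return is_shareable_adpcm(a->kind) && is_shareable_adpcm(b->kind);

    if (is_adpcm(a->kind) != is_adpcm(b->kind))
        return 0;                          /* no ADPCM with RIFF WAVE */

    /* Both are RIFF WAVE: neither may be stereo, mu-law, or a-law. */
    return a->kind == TK_PCM && b->kind == TK_PCM &&
           !a->stereo && !b->stereo;
}
```

A check of this kind only anticipates the rejection; the AAPI's own return codes (for example 3229 or 3234) remain the authority on whether a combination is accepted.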
  Synopsis:
  _________

  int aud_strt(acbptr)                /* Start audio processing         */
  struct acb *acbptr;                 /* Pointer to Audio Control Block */

  Input Parameters:
  _________________

  Input ACB:
  __________

  Itemized below are the ACB fields that must be initialized properly
  before the aud_strt function is called. All other fields should have
  been initialized on an "aud_init" call and a previous "aud_set" call.
  The ACB must be maintained between calls (aud_init through aud_term)
  with the user changing only defined input fields for a particular
  function call.

  CONTROLS - Control changes requested (Set bits individually)
               01h = Master volume
               02h = Track volume - 1
               04h = Track volume - 2
               08h = Channel balance
               10h = Set stop position
               40h = Pause current operation
               80h = Resume current operation
             Settings "40h and 80h" are mutually exclusive.
  CTLPARMS - Control parameters flag (Set bits individually)
               01h = Stop - Ignore other control requests (on)
                     Stop - Honor pending control requests (off)
                     Setting "01h" is ignored if CONTROLS is not
                     set for "set stop position".
               02h = Purge control queue for this channel
               08h = Pause/resume all channels (on)
                     Pause/resume only this channel (off)
                     Setting "08h" is ignored if CONTROLS is not
                     set for "pause" or "resume".
| MASVOL   - Master volume level (ADPCM 0 - 100/PCM 0 - 120)
             Required if "CONTROLS" is set for master volume.
  TRKVOL1  - Track volume level (0 - 100)
             Required if "CONTROLS" is set for track volume-1.
  TRKVOL1S - Start position for track volume change (ms)
             Required if "CONTROLS" is set for track volume-1.
  TRKVOL1E - End position for track volume change (ms)
             Required if "CONTROLS" is set for track volume-1.
  TRKVOL2  - Track volume level (0 - 100)
             This field applies to exactly the same control as
             TRKVOL1. This second field gives the caller the ability
             to specify two volume fades at different positions in
             the audio. If the positions overlap the caller will be
             overriding the previous control change that was requested.
             Required if "CONTROLS" is set for track volume-2.
  TRKVOL2S - Start position for track volume change (ms)
             Required if "CONTROLS" is set for track volume-2.
  TRKVOL2E - End position for track volume change (ms)
             Required if "CONTROLS" is set for track volume-2.
  CHNBAL   - Channel balance (pan) level (100 - 0)
             Required if "CONTROLS" is set for channel balance.
  OUTCHAN  - Output channel pair for balance
               0 = Channel left/right
               1 = Channel right/left
             Required if "CONTROLS" is set for channel balance.
  CHNBALS  - Start position for channel balance change (ms)
             Required if "CONTROLS" is set for channel balance.
  CHNBALE  - End position for channel balance change (ms)
             Required if "CONTROLS" is set for channel balance.
  STOPPOS  - Position to stop current operation (ms)
             Required if "CONTROLS" is set for set stop position.
  ACB2PTR  - ACB pointer of second track to process
               00000000h = Single track

  Output Return Values:
  _____________________

     0, Successful
  3220, Control queue overflow, control ignored
  3227, Disk full - out of space
  3228, Audio object at maximum size
  3229, DSP overload - Attempted to play a stereo or HQ music track
        and another track simultaneously
  3230, Recording media too slow, recording data lost
  3231, Operation cancelled, process not in foreground
  3232, Audio file at maximum file length (address space)
  3233, DSP running and requested operation requires reload of DSP
  3234, MIDI audio type can not be played on Channel A or in DOS
  3235, Device driver error occurred when writing MIDI data
  3236, Operating system call failed, system inconsistency

  Output ACB:
  ___________

  Itemized below are the ACB fields set by the aud_strt function. All
  output fields marked with "*" are updated at one tenth of a second
  intervals requiring no function call once an asynchronous audio
  operation (play, record) has been started.
  * POSITION - Current position in audio (ms)
  * STATE    - Current state of signal processor
                 00h = stopped
                 01h = playing
                 02h = recording
                 03h = stopping
  * TIMELEFT - Time left before I/O necessary (ms)
    OUPDATED - Indication of which objects were changed by AAPI
  * BACKRC   - Set if error occurs during an operation, between
               explicit calls to the AAPI.
    ACB2RC   - Return code for second ACB operation

  Function:
  _________

  1. Any control changes requested are executed or queued for execution
     before the operation is started. This is true for both the primary
     ACB and the secondary ACB.

  2. Based on the requested command(s) on a previous "aud_set" call, the
     requested operation(s) are started.

  3. If an error occurs during the operation, but not during an explicit
     call to the AAPI, then the "backrc" field is set. For example, if
     the disk becomes full during a record operation, the operation will
     be stopped, and the appropriate return code set into the "backrc"
     field. The caller does not need to poll this return code
     continuously during the operation. If an error occurs, the
     operation will be terminated without waiting for the caller to
     request a stop. The caller can simply look at this field after the
     operation has stopped to see if the operation completed
     successfully.

  Notes:
  ______

  o Play, record, and monitor are asynchronous operations. That is, they
    run independently until stopped by an aud_stop call. The user can
    monitor the progress of the operation using the ACB output fields
    described above. The user can control the operation (volume, I/O
    request) with additional calls, but the operation continues during
    and after these additional calls.

  o Due to DSP code size and performance restrictions on the M-ACPA
    signal processor and the host PC not all functions can be loaded and
    performed simultaneously. The restrictions are:

    - Recording is limited to one track at a time.
    - Playing two AVC ADPCM tracks is allowed as long as neither track
      is using stereo or high quality audio compression.

|   - AVC ADPCM and RIFF WAVE PCM files can not be played at the same
|     time.
|
|   - Playing two RIFF WAVE tracks is allowed as long as neither track
|     is stereo, mu-law, or a-law.
|
|   - 44 kHz stereo mu-law or a-law is not allowed.
|
|   - Monitoring the recorded output of an ADPCM stereo compression
|     track or any PCM track while it is being recorded is not allowed.
|     The source can be monitored during the recording but not the
|     result.
|
|   - Source mix can only be used on one PCM track at a time.

    - Playing one track and recording a different track (dubbing)
      simultaneously is not allowed.

    - MIDI audio files can only be played in OS/2, not in DOS.

    - MIDI audio files can only be played on Track B. The only audio
      type that can be played simultaneously with MIDI is ADPCM voice on
      Track A. There are two DSP environments (DSP load modules) where
      voice can be played, VOICE/MIDI and VOICE/(VOICE or MUSIC). Once
      one of the environments has been set up and is started, changing
      to a different environment can not be done without stopping the
      DSP. For example, if the VOICE/MIDI environment is loaded and
      VOICE is playing, only MIDI can be played simultaneously. If a
      MUSIC file is now to be played the DSP must be stopped and
      reloaded. If the VOICE/MUSIC environment had been loaded in the
      first place then both VOICE and MUSIC could be played without
      stopping the DSP. If MIDI or MUSIC is the first audio type to be
      requested for play, then there is no choice on which environment
      to load. If VOICE is the first request then the caller can pick
      the environment by setting or clearing the "Setup VOICE/MIDI" flag
      in the OPRPARMS field.

  3.2.6 AUD_CTRL - CONTROL AN AUDIO OPERATION

  Description:
  ____________

  Change the current audio settings or control the operation in
  progress.
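The CONTROLS and CTLPARMS fields used by aud_strt and aud_ctrl are bit masks, so several control changes can be requested in one call. The following is a minimal sketch in C of composing and checking such a request. The macro and function names are hypothetical (the document gives only the bit values), and the checks simply restate the documented rules: pause and resume are mutually exclusive, and two CTLPARMS bits are ignored unless the matching control is requested.

```c
#include <assert.h>

/* Hypothetical names for the documented CONTROLS bit values. */
#define CTL_MASVOL        0x01u  /* master volume */
#define CTL_TRKVOL1       0x02u  /* track volume - 1 */
#define CTL_TRKVOL2       0x04u  /* track volume - 2 */
#define CTL_CHNBAL        0x08u  /* channel balance */
#define CTL_STOPPOS       0x10u  /* set stop position */
#define CTL_PERFORM_IO    0x20u  /* perform I/O (listed for aud_ctrl) */
#define CTL_PAUSE         0x40u  /* pause current operation */
#define CTL_RESUME        0x80u  /* resume current operation */

/* Hypothetical names for two of the CTLPARMS bit values. */
#define CTP_STOP_IGNORE   0x01u  /* stop ignores other control requests */
#define CTP_ALL_CHANNELS  0x08u  /* pause/resume all channels */

/* Returns 1 if the mask uses only defined bits and does not request
 * pause and resume together, otherwise 0. */
int controls_valid(unsigned controls)
{
    if (controls & ~0xFFu)
        return 0;
    if ((controls & CTL_PAUSE) && (controls & CTL_RESUME))
        return 0;
    return 1;
}

/* Returns the CTLPARMS bits that would actually take effect: 01h only
 * matters with "set stop position", 08h only with pause or resume. */
unsigned ctlparms_effective(unsigned controls, unsigned ctlparms)
{
    if (!(controls & CTL_STOPPOS))
        ctlparms &= ~CTP_STOP_IGNORE;
    if (!(controls & (CTL_PAUSE | CTL_RESUME)))
        ctlparms &= ~CTP_ALL_CHANNELS;
    return ctlparms;
}

/* Example: request a master volume change and a stop position together. */
unsigned example_controls(void)
{
    unsigned controls = CTL_MASVOL | CTL_STOPPOS;

    assert(controls_valid(controls));
    return controls;
}
```

The value returned by example_controls is what a caller would store in the CONTROLS field, after filling in MASVOL and STOPPOS, before making the call.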
  Synopsis:
  _________

  int aud_ctrl(acbptr)                /* Change audio controls          */
  struct acb *acbptr;                 /* Pointer to Audio Control Block */

  Input Parameters:
  _________________

  Input ACB:
  __________

  Itemized below are the ACB fields that must be initialized properly
  before the aud_ctrl function is called. All other fields should have
  been initialized on previous function calls. The ACB must be
  maintained between calls (aud_init through aud_term) with the user
  changing only defined input fields for a particular function call.

  CONTROLS - Control changes requested (Set bits individually)
               01h = Master volume
               02h = Track volume - 1
               04h = Track volume - 2
               08h = Channel balance
               10h = Set stop position
               20h = Perform I/O
               40h = Pause current operation
               80h = Resume current operation
             Settings "40h and 80h" are mutually exclusive.
  CTLPARMS - Control parameters flag (Set bits individually)
               01h = Stop - Ignore other control requests (on)
                     Stop - Honor pending control requests (off)
                     Setting "01h" is ignored if CONTROLS is not
                     set for "set stop position".
               02h = Purge control queue for this channel
               04h = This call is being made from an interrupt
                     service routine in DOS. If DOS is "busy" and
                     can not be re-entered then the request will
                     not be processed and error code 3217 will be
                     returned.
               08h = Pause/resume all channels (on)
                     Pause/resume only this channel (off)
                     Setting "08h" is ignored if CONTROLS is not
                     set for "pause" or "resume".
  MASVOL   - Master volume level (0 - 100)
             Required if "CONTROLS" is set for master volume.
  TRKVOL1  - Track volume level (0 - 100)
             Required if "CONTROLS" is set for track volume-1.
  TRKVOL1S - Start position for track volume change (ms)
             Required if "CONTROLS" is set for track volume-1.
  TRKVOL1E - End position for track volume change (ms)
             Required if "CONTROLS" is set for track volume-1.
  TRKVOL2  - Track volume level (0 - 100)
             This field applies to exactly the same control as
             TRKVOL1.
             This second field gives the caller the ability to
             specify two volume fades at different positions in the
             audio. If the positions overlap the caller will be
             overriding the previous control change that was
             requested.
             Required if "CONTROLS" is set for track volume-2.
  TRKVOL2S - Start position for track volume change (ms)
             Required if "CONTROLS" is set for track volume-2.
  TRKVOL2E - End position for track volume change (ms)
             Required if "CONTROLS" is set for track volume-2.
  CHNBAL   - Channel balance (pan) level (100 - 0)
             Required if "CONTROLS" is set for channel balance.
  OUTCHAN  - Output channel pair for balance
               0 = Channel left/right
               1 = Channel right/left
             Required if "CONTROLS" is set for channel balance.
  CHNBALS  - Start position for channel balance change (ms)
             Required if "CONTROLS" is set for channel balance.
  CHNBALE  - End position for channel balance change (ms)
             Required if "CONTROLS" is set for channel balance.
  STOPPOS  - Position to stop current operation (ms)
             Required if "CONTROLS" is set for set stop position.
  IOTIME   - The amount of audio data in milliseconds to read into
             or write from the audio buffers.
               00000000h = Maximum amount of data
             Required if "CONTROLS" is set for perform I/O.
  ACB2PTR  - ACB pointer of second track to process
               00000000h = Single track

  Output Return Values:
  _____________________

     0, Successful
  3207, "Escape" file read error
  3211, "Escape" file seek error
  3217, I/O not possible at this time
  3219, "Escape" file write error
  3220, Control queue overflow, control ignored
  3235, Device driver error occurred when writing MIDI data
  3236, Operating system call failed, system inconsistency

  Output ACB:
  ___________

  Itemized below are the ACB fields set by the aud_ctrl function. All
  output fields marked with "*" are updated at one tenth of a second
  intervals requiring no function call once an operation (play, record)
  has been started.
  * POSITION - Current position in audio (ms)
  * STATE    - Current state of signal processor
                 00h = stopped
                 01h = playing
                 02h = recording
                 03h = stopping
  * TIMELEFT - Time left before I/O necessary (ms)
    OUPDATED - Indication of which objects were changed by AAPI
    ACB2RC   - Return code for second ACB operation

  Function:
  _________

  1. Control changes are queued for execution based on their order (01,
     02, 04...) in the "CONTROLS" field and the requested start position
     in the audio for that control. Use separate aud_ctrl calls if a
     different order of execution is required. Identical control changes
     for different channels, requested by setting the control in both
     the primary and secondary ACBs, will be done as closely together as
     possible. For example, all "master volume" control changes will be
     queued for execution before the next control is considered.

  Notes:
  ______

  o If playing, and a request to "perform I/O" is not made before the
    data in the audio buffers is exhausted, the AAPI will begin
    asynchronously doing the minimum amount of I/O needed to keep the
    audio card serviced. Subsequent "perform I/O" requests may be able
    to replenish the buffers and re-synchronize the process but if this
    is not possible asynchronous I/O will continue to be done until the
    play operation is stopped.

  o If recording, and a request to "perform I/O" is not made before the
    audio buffers are filled to capacity, the AAPI will begin
    asynchronously doing the minimum amount of I/O needed to partially
    empty the audio buffers and accept more audio data from the audio
    card. Subsequent "perform I/O" requests may be able to flush the
    buffers and re-synchronize the process but if this is not possible
    asynchronous I/O will continue to be done until the record operation
    is stopped.

  o A stop control request will not stop the operation immediately if
    the CTLPARMS flag is set to honor other controls. Any controls
    requested to be executed or in progress will be allowed to complete
    before the stop.
    This means that the audio card must not be allowed to run out of
    audio data or overflow the audio buffers before the last control is
    completed. The caller must be certain that sufficient data or room
    is in the audio buffers before requesting a stop, or must continue
    to make aud_ctrl calls. If the caller is unable to make aud_ctrl
    calls, the AAPI will do I/O asynchronously as necessary to continue
    servicing the audio card.

  3.2.7 AUD_TERM - TERMINATE AUDIO PROCESSING

  Description:
  ____________

  Terminate audio operations for this ACB.

  Synopsis:
  _________

  int aud_term(acbptr)                /* Terminate audio processing     */
  struct acb *acbptr;                 /* Pointer to Audio Control Block */

  Input Parameters:
  _________________

  Input ACB:
  __________

  Itemized below are the ACB fields that must be initialized properly
  before the aud_term function is called. All other fields should have
  been initialized on previous calls. The ACB must be maintained between
  calls (aud_init through aud_term) with the user changing only defined
  input fields for a particular function call.

  ACB2PTR  - ACB pointer of second track to process
               00000000h = Single track

  Output Return Values:
  _____________________

     0, Successful

  Output ACB:
  ___________

  Itemized below are the ACB fields set by the aud_term function. All
  output fields marked with "*" are updated at one tenth of a second
  intervals requiring no function call once an operation (play, record)
  has been started.

    OUPDATED - Indication of which objects were changed by AAPI
    ACB2RC   - Return code for second ACB operation

  Function:
  _________

  1. Any interrupt handler/device driver for the signal processor is
     deactivated.

  2. Any objects that were read in by the AAPI and were not changed are
     purged from memory. If the object was already open, then it is left
     open but marked as updated if it was changed.

  3. If the AAPI allocated the I/O buffer it is de-allocated.

  Notes:
  ______

  4.
  CONTROL DATA STRUCTURES
  ___________________________

  The following section provides descriptions of the control blocks used
  with the audio control type functions of the AAPI.

  4.1 AUDIO DEVICE CONTROL BLOCK (ADCB)
  ______________________________________

  The ADCB is the structure used to retrieve configuration information
  about the audio device installed in the system. The only audio device
  currently supported is the IBM M-Audio Capture and Playback Adapter.

  ;                     ** Audio Device Control Block **
  ;                     _________________________
  adcb      struc
  ;                     *** OUTPUT PARAMETERS ***
  device_id dw          ; Device ID of audio card
                        ;   6E6Ch = IBM M-ACPA (PS/2)
  iobase    dw          ; I/O address base
  intlev    dw          ; Hardware interrupt level
  os2rtn    dw          ; OS/2 error code
  res       db 10 dup   ; Reserved
  adcb      ends

  The parameters in the AAPI Device Control Block can have the following
  values:

  1. DEVICE_ID - The microchannel device ID of the installed adapter.

  2. IOBASE - The starting address of eight I/O addresses that the
     adapter has been configured to use.

  3. INTLEV - The hardware interrupt level that the adapter has been
     configured to use.

  4. OS2RTN - If a call to OS/2 fails while accessing the device driver,
     the OS/2 error code will be returned in this field.

  5. RESERVED - A reserved area for future expansion.

  4.2 AUDIO START/STOP LIST STRUCTURE (ALST)
  ___________________________________________

  The ALST is the structure used to pass additional start/stop times to
  the Audio API during play operations.

  ;                     ** Audio Start/Stop List Structure
  ;                     _________________________
  alst      struc       ;
  audstart  dd          ; Start position for seek (milliseconds)
  audend    dd          ; End position for seek (milliseconds)
  nextlist  dd          ; Ptr to next ALST
  alst      ends

  The fields in the ALST can have the following values:

  1. AUDSTART - Position to start next play operation.

  2. AUDEND - Position to end the play operation.

  3. NEXTLIST - Pointer to next structure or null. If null the operation
     will end.
     If not null, playing will continue at the next "audstart" position.
     NEXTLIST may point to itself, or another structure already
     processed to create a loop.

  4.3 AUDIO CONTROL BLOCK (ACB)
  ______________________________

  The ACB is the primary control block used for communication between
  the AAPI and the calling application program. The control block has
  input parameters which are used to control how the AAPI interacts with
  the audio hardware and has output parameters which the AAPI uses to
  convey information back to the feature program. The control block has
  the following structure:

  ;                     ** Audio Control Block **
  ;                     _________________________
  acb       struc
  ;                     *** INPUT PARAMETERS ***
  channel   db          ; Input channel identifier
                        ;   00h = Channel A
                        ;   01h = Channel B
|                       ;   FFh = Select and return channel
  dspmode   db          ; Signal processor operation
                        ;   01h = Play
                        ;   02h = Record
                        ;   03h = Monitor
  seektype  db          ; Type of seek in audio object
                        ;   00h = Init/seek in new object
                        ;   01h = Seek in current object
                        ;   02h = Continue in current object
                        ;   03h = Release resources only
                        ;   04h = Release, then init and seek
  audtype   db          ; Type of audio compression to use
                        ;   00h = Current/Default
                        ;   01h = ADPCM 11.0K (music)
                        ;   02h = ADPCM 5.5K (voice)
                        ;   03h = ADPCM 22.0K (stereo music)
                        ;   04h = ADPCM 22.0K (High Qual music)
                        ;   60h = PCM
  intlevel  db          ; Audio card interrupt level
  inpsrce   db          ; Audio card input source
                        ;   00h = Microphone input - normal
                        ;   01h = Line level input - left
                        ;   02h = Line level input - right
                        ;   03h = Line level input - both
                        ;   04h = Microphone input - low
  piobase   dw          ; Audio card base address
  controls  dw          ; Control changes requested
                        ; (Set bits individually)
                        ;   01h = Master volume
                        ;   02h = Track volume - 1
                        ;   04h = Track volume - 2
                        ;   08h = Channel Balance
                        ;   10h = Set stop position
                        ;   20h = Perform I/O
                        ;   40h = Pause current operation
                        ;   80h = Resume current operation
  oprparms  dw          ; Operation parameters
                        ; (Set bits individually)
                        ;   02h = Do not update audio object
                        ;         totals during record operations
                        ;   04h = PS/2 speaker output enabled
                        ;   10h = Monitor, record pre-compression
                        ;   20h = Monitor, record post-compression
                        ;  200h = Cancel operation if process is
                        ;         not in foreground
                        ;  400h = Setup VOICE/MIDI DSP
                        ;         environment if voice is started
                        ;         first
  fileptr   dd          ; File Access Block pointer
  subtype   dw          ; FABO subtype of objects
  audstart  dd          ; Start position for seek (milliseconds)
                        ;   00000000h = Start of data
  audend    dd          ; End position for seek (milliseconds)
                        ;   00000000h = End of data
  bufflen   dd          ; I/O buffer length
  buffptr   dd          ; I/O buffer pointer
  emmhan    dw          ; Expanded memory manager handle
  emmcnt    dw          ; Expanded memory manager page count
  memid     dw          ; Memory type id
                        ;   00h = not allocated
                        ;   01h = main memory
                        ;   02h = LIM memory
            db 6 dup    ; (reserved -- 00h)
  ctlparms  db          ; Control parameters
                        ; (Set bits individually)
                        ;   01h = Stop - Ignore other ctrls (on)
                        ;         Stop - Honor pending ctrls (off)
                        ;   02h = Purge control queue for this
                        ;         channel before processing
                        ;         new controls
                        ;   04h = This call is being done from
                        ;         within an interrupt service
                        ;         in DOS.
                        ;   08h = Pause/Resume all channels (on)
                        ;         Pause/Resume only this ch (off)
|                       ;   10h = Restore previous controls
  masvol    db          ; Master volume level
|                       ;   ADPCM (0-100) PCM (0-120)
  trkvol1   dw          ; Track volume level 1 (0 - 100)
  trkvol1s  dd          ; Start position for trk vol1 change (ms)
  trkvol1e  dd          ; End position for trk vol1 change (ms)
  trkvol2   dw          ; Track volume level 2 (0 - 100)
  trkvol2s  dd          ; Start position for trk vol2 change (ms)
  trkvol2e  dd          ; End position for trk vol2 change (ms)
  chnbal    db          ; Channel balance (pan) level (100 - 0)
  outchan   db          ; Output channel pair for balance
                        ;   0 = Channel Left/right
                        ;   1 = Channel Right/Left
  chnbals   dd          ; Start position for chan bal change (ms)
  chnbale   dd          ; End position for channel bal change (ms)
  stoppos   dd          ; Position to stop operation (ms)
  iotime    dd          ; Length of audio read/write (ms)
                        ;   00000000h = Maximum length
  acb2ptr   dd          ; ACB pointer of second track to process
                        ;   00000000h = Single track
  listptr   dd          ; Ptr to list of start/stop positions
  dsp_path  dd          ; Ptr to directory path for DSP code
| svrsptr   dd          ; Ptr to save/restore area for controls
| fmtptr    dd          ; Ptr to audio format data
| in_reser  db 20 dup   ; Reserved area for input parm expansion
  ;                     *** OUTPUT PARAMETERS ***
  position  dd          ; Current position in audio (ms)
  state     db          ; Current state of signal processor
                        ;   00h = stopped
                        ;   01h = playing
                        ;   02h = recording
                        ;   03h = stopping
  backrc    dw          ; Background process return code
  acb2rc    dw          ; Return code for second track operation
  timeleft  dd          ; Time left before I/O necessary (ms)
  oupdates  dw          ; Audio objects updated flag
                        ; (Set bits individually)
                        ;   01h = AUDIO object changed
                        ;   02h = VOLUME object changed
                        ;   04h = ESCAPE object changed
  out_rese  db 8 dup    ; Reserved area for output parm expansion
|           db 256 dup  ; AAPI work area
  acb       ends

  The parameters in the AAPI control block can have the following
  values:

  1.  CHANNEL - channel/track identifier

      This parameter selects the channel on the signal processor where
      the operation will be performed.

      o 0 - Channel A
      o 1 - Channel B

|     o 255 - The AAPI will select and return an available channel in
|       this field.

  2.  DSPMODE - Mode for audio signal processor

      o 1 - Initialize and load, if not already loaded, signal processor
        for play operations.

      o 2 - Initialize and load, if not already loaded, signal processor
        for record operations.

      o 3 - Initialize and load, if not already loaded, signal processor
        for monitor operations.

  3.  SEEKTYPE - Type of processing to do on the audio objects.

      o 0 - Initialize for audio objects in passed FAB. Seek to position
        requested in "AUDSTART".

      o 1 - Seek to position requested in "AUDSTART" in current audio
        object(s).

      o 2 - Continue in the current audio object(s) with no seek.

      o 3 - Release audio object resources allocated by AAPI. No other
        processing is done. Assumes FAB information is still correct.

      o 4 - End processing on current audio object(s), and initialize
        for audio objects in passed FAB. Seek to position requested in
        "AUDSTART". Assumes that previous FAB is still allocated and
        valid.

  4.  AUDTYPE - Type of compression method to use when recording or
      monitoring.

      o 00h - Use method in current object, or default for new object

      o 01h - Use ADPCM/11K algorithm (mono music)

      o 02h - Use ADPCM/5.5K algorithm (mono voice)

      o 03h - Use ADPCM/22K algorithm (stereo music)

      o 04h - Use ADPCM/22K algorithm (mono high quality music)

|     o 60h - Use PCM algorithm

  5.  INTLEVEL - The hardware interrupt level the audio card has been
      configured to use.

  6.  INPSRCE - The input source the audio card should consider active.

      o 00h - Microphone input - normal gain

      o 01h - Line level input - left channel only

      o 02h - Line level input - right channel only

      o 03h - Line level input - both channels

      o 04h - Microphone input - low gain

  7.  PIOBASE - The starting address of a block of eight I/O addresses
      that the audio card has been configured to use.

  8.
      CONTROLS - flag indicating control changes to execute

      o 01h - Set the master volume based on caller's input

      o 02h - Set the track volume based on caller's input

      o 04h - Set the track volume (at a different position)

      o 08h - Set the channel balance based on caller's input

      o 10h - Set the position in the audio to stop current operation

      o 20h - Refresh the I/O buffer if necessary

      o 40h - Pause the current operation until a "resume" control is
        requested.

      o 80h - Resume a "paused" operation.

  9.  OPRPARMS - flag indicating parameters for DSP operations.

      o 02h - Do not update audio and volume object segment totals
        during record operations. The variable sections of these objects
        will be updated normally but fields in the header will not be
        changed by the AAPI.

      o 04h - Enables the PS/2 speaker for output from the audio card.
        This has no effect on the other output sources (line, speaker).
        This flag is ignored if not running on a PS/2 system.

      o 10h - Enables pass through of any input to the card, while
        recording, to the output channels, before any compression is
        done.

      o 20h - Enables pass through of any input to the card, while
        recording, to the output channels, after compression has taken
        place.

      o 200h - If the process that the Audio API is running in is
        switched to the background (user task switch for example) the
        current operation will be cancelled and an error will be
        returned to the caller (OS/2 only).

      o 400h - If playing VOICE mode audio is requested first, then
        setting this flag will cause the DSP code environment to be set
        up to play VOICE and MIDI simultaneously. If the flag is not set
        then the DSP environment will be set up for VOICE and (VOICE or
        MUSIC).

  10. FILEPTR - A pointer to a File Access Block for an audio file. This
      pointer allows access, using FABOs, to the various audio objects.

  11. SUBTYPE - The sub-type of the objects to operate on. This would
      normally be zero if only one track of audio data is in the file.

  12.
      AUDSTART - The start position in milliseconds of where to begin
      the requested operation in the audio track. A value of zero will
      be considered the physical start of the track.

  13. AUDEND - The end position in milliseconds of where to stop the
      requested operation in the audio track. A value of zero will be
      considered the physical end of the track.

  14. BUFFLEN - The length of the caller passed I/O buffer (buffptr), or
      the length that the AAPI should allocate from main memory.

  15. BUFFPTR - A pointer to a buffer to be used for I/O operations by
      the AAPI.

  16. EMMHAN - The expanded memory manager handle to use if the caller
      passes a LIM memory pointer (BUFFPTR).

  17. EMMCNT - The expanded memory manager page count to use if the
      caller passes a LIM memory pointer (BUFFPTR).

  18. MEMID - The type of memory passed by the caller in "BUFFPTR".

      o 00h - not allocated

      o 01h - main memory

      o 02h - LIM memory

      o 03h - Device

  19. RESERVED - A reserved area for additional information about the
      caller's passed memory area.

  20. CTLPARMS - flag indicating parameters for control requests

      o 01h - Stop request will ignore other controls in progress or
        pending and execute at the time requested in STOPPOS. If the bit
        is off the stop will not execute until the time requested
        (STOPPOS) and all other controls in progress or pending have
        completed.

      o 02h - Purge this channel's control queue before processing any
        new controls requested in CONTROLS.

      o 04h - This call is being made from within an interrupt service
        routine in DOS. The call will not be processed if DOS is "busy"
        and not re-entrant.

      o 08h - Based on what "CONTROLS" value is set (pause or resume)
        the current operation for all channels will be paused or resumed
        if this flag is set. If the flag is not set then only the
        current channel will be affected.

|     o 10h - Requests that any previously saved control information be
|       restored.
This is used to restart a play or record operation in | the same position with the same control settings as when it was | stopped. 21. MASVOL - For ADPCM, a value from zero to one hundred specifying the percentage of maximum volume that can be used at this time. For example, a setting of 70 would mean that only 70% of the maximum | volume the hardware supports will be allowed. When playing and | recording PCM the upper limit is 120%. Setting values over 100% cause | the volume level of the track being recorded or played to be boosted | above its original recorded level. 22. TRKVOL1 - A value from zero to one hundred specifying the percentage of the maximum volume to be used while playing this track. The maximum volume allowed is based on what the hardware can support and the current setting of the master volume (MASVOL) control. 23. TRKVOL1S - The start position in milliseconds of where to begin fading the track volume level (TRKVOL1) in the audio track. 24. TRKVOL1E - The end position in milliseconds of where to complete the track volume level change in the audio track. Subtracting TRKVOL1S from TRKVOL1E will give the amount of time in milliseconds that will be used while fading from the current track volume setting to the requested volume setting (TRKVOL1). 25. TRKVOL2 - A value from zero to one hundred specifying the percentage of the maximum volume to be used while playing this track. The maximum volume allowed is based on what the hardware can support and the current setting of the master volume (MASVOL) control. 26. TRKVOL2S - The start position in milliseconds of where to begin fading the track volume level (TRKVOL2) in the audio track. 27. TRKVOL2E - The end position in milliseconds of where to complete the track volume level change in the audio track. Subtracting TRKVOL2S from TRKVOL2E will give the amount of time in milliseconds that will be used while fading from the current track volume setting to the requested volume setting (TRKVOL2). 4. 
Control Data Structures 45 28. CHNBAL - A value from zero to one hundred specifying the percentage of the signal balance that is sent to the first channel in the two channels named in "OUTCHN". The second channel receives the remainder of the signal. 29. OUTCHAN - A value that indicates the pair of channels that "CHNBAL" is dividing the signal between. ù 00h - Left channel(signal % in CHNBAL), Right channel (remaining %) ù 01h - Right channel (signal % in CHNBAL), Left channel (remaining %) 30. CHNBALS - The start position in milliseconds in the audio track of where to begin fading from the current balance setting to the requested balance setting (CHNBAL). 31. CHNBALE - The end position in milliseconds in the audio track of where to complete fading from the current balance setting to the requested balance setting (CHNBAL). 32. STOPPOS - The position in milliseconds in the audio track of where to stop the current operation. 33. IOTIME - The amount of audio data (in milliseconds) that the caller is requesting to be read into or written from the I/O buffer (BUFFPTR). A zero value will cause the maximum amount of data possible to be read or written. 34. ACB2PTR - A pointer to a second Audio Control Block. The AAPI will attempt to perform the requested operations in both ACBs as simultane- ously as possible. This field allows two channel play and dub oper- ations to be done with one call. The return code for the second ACB is in the output field "ACB2RC". 35. LISTPTR - A pointer to a structure (linked list) of start/stop posi- tions to use during play operations. This list is used after the position in "AUDEND" is reached. 36. DSP_PATH - A pointer to a directory path to use to find the signal processor code files. If this field is zero then the current direc- tory is used. | 37. SVRSPTR - A pointer to a save/restore area to be used to store the | current control state (volume, balance, etc). | | 38. 
FMTPTR - A pointer to an audio format area containing information about
|   the type of audio compression being performed.

39. IN_RESERVE - A 20 byte area reserved for expansion of the input
    parameters.

40. POSITION - The current position (in milliseconds) in the audio data.
    This field is updated at 0.1 second intervals by the AAPI during
    asynchronous audio operations.

41. STATE - The current state of the signal processor.

    ∙ 00h - Stopped. The signal processor is halted.
    ∙ 01h - Playing. The signal processor is playing.
    ∙ 02h - Recording. The signal processor is recording.
    ∙ 03h - Stopping. The signal processor has received a stop request
      but is completing an operation before stopping.

42. BACKRC - Return code set by the AAPI when running in the background.
    After an operation has been started, and the AAPI detects an error
    (device full for example), the operation will be stopped and this
    field will contain the error code.

43. ACB2RC - The return code from a second ACB passed in field
    "ACB2PTR".

44. TIMELEFT - Time remaining before the AAPI begins asynchronous I/O.

45. OUPDATES - Audio objects updated flag

    ∙ 01h = AUDIO object changed
    ∙ 02h = VOLUME object changed
    ∙ 04h = ESCAPE object changed

46. OUT_RESERVE - An eight byte area reserved for expansion of the
    output parameters.

47. AAPIWORK - A reserved area for the AAPI work area.


| 4.4 AUDIO FORMAT STRUCTURE (AFMT)
  _________________________________

| The AFMT is the structure used to pass additional information about the
| audio type. Currently it is only used with the PCM audio type. The field
| values marked below with (W) are valid for RIFF WAVE files. If the caller
| uses other values (marked with (i)) that are valid in the AAPI but not yet
| defined for the RIFF WAVE format then the file will be saved as a RIFF
| file but with a form type of "ibmw" instead of WAVE.

| ; ** Audio Format Structure
| ; _________________________
| afmt                  struc
| ;
| format                dw        ; Format of audio data
|                                 ;   01h = Linear PCM (W)
|                                 ;   02h = mu-law (W)
|                                 ;   03h = A-law (W)
| samples_per_second    dd        ; Sample rate in hertz
|                                 ;   8000, 11025, 22050, 44100 (W)
| bits_per_sample       dw        ; Sample width in bits
|                                 ;   8 or 16 (W)
| channels              dw        ; Number of channels
|                                 ;   1 or 2 (W)
| sample_number_format  dw        ; Format of sample
|                                 ;   02h = Unsigned (W)
|                                 ;         (8 bits per sample)
|                                 ;   01h = Signed
|                                 ;   00h = 2's complement (W)
|                                 ;         (16 bits per sample)
| dither_percent        dw        ; Amount of dither in % of
|                                 ;   one bit (Rec/monitor only)
|                                 ;   21h = Recommended value
| format_flag           dw        ; General flag
|                                 ;   01h - Source Mix on
| reserved              db 20 dup ; Reserved for expansion
| afmt                  ends
|
| See Appendix C, "Additional Information on PCM Support" on page 71 for a
| detailed explanation of the above fields.


5. USAGE EXAMPLES
_________________

5.1 CHECK AUDIO HARDWARE AND SOFTWARE CONFIGURATION
___________________________________________________

1. Verify that signal processor microcode file is located in a path that
   can be found when the application is executed.

2. Allocate an Audio Device Control Block to pass to the AAPI.

3. Call AUD_CFIG passing a pointer to the ADCB.

4. Check return code

   ∙ If 0, this system has an available audio device. The ADCB contains
     the hardware configuration information.
   ∙ If 3224, no audio device was found.
   ∙ If 3225, an audio device was found, but interrupts are disabled.
   ∙ If 3226, the system has an audio device, but the device driver is
     not responding. The OS/2 error code is available in the ADCB.


5.2 PLAY (SINGLE CHANNEL)
_________________________

| 1. Determine type of audio file (FAB_TYPE).

2. Allocate and initialize FAB/FABOs and open the audio file (FAB_OPEN,
   FAB_ROPN, or application file handling code)

3. Allocate an I/O buffer and an ACB to pass to the AAPI.

4. Call AUD_INIT passing the ACB pointer.

5. Call AUD_SET with the appropriate parameters. These parameters can
   include setting volume fade-in and fade-out controls and the position
   in the audio to stop playing.

6. Call AUD_STRT when ready to start playing. Monitor the following
   output fields in the ACB which will be updated at 0.1 second
   intervals:

   ∙ POSITION - Current position in audio.
   ∙ TIMELEFT - Time left before AAPI will begin doing asynchronous I/O
     to refill the audio buffers.

7. Call AUD_CTRL* as many times as necessary to change audio controls
   and/or to request that I/O be performed to refill the audio buffers.

8. Call AUD_SET to continue or start a new operation or AUD_TERM to
   terminate audio processing. If AUD_TERM is called all audio objects
   that were not already open before the AUD_INIT call will be closed.

* Note: Audio control requests can be made on all AAPI calls except
  AUD_INIT and AUD_TERM by setting the "CONTROLS" parameter in the ACB.


5.3 RECORD/MONITOR
__________________

| 1. Determine type of audio file (FAB_TYPE).

2. Allocate and initialize FAB/FABOs and open the audio file (FAB_OPEN,
   FAB_ROPN or application file handling code)

3. Allocate an I/O buffer and an ACB to pass to the AAPI.

4. Call AUD_INIT passing the ACB pointer.

5. Call AUD_SET with the appropriate parameters.

6. Call AUD_STRT when ready to start recording. Monitor the following
   output fields in the ACB which will be updated at 0.1 second
   intervals:

   ∙ POSITION - Current position in audio.
   ∙ TIMELEFT - Time left before AAPI will begin doing asynchronous I/O
     to flush the audio buffers.

7. Call AUD_CTRL as many times as necessary to request that I/O be
   performed to flush the audio buffers or to set the stop point in the
   audio.

8. Call AUD_SET to continue or start a new operation or AUD_TERM to
   terminate audio processing. If AUD_TERM is called all audio objects
   that were not already open before the AUD_INIT call will be closed.
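
The monitoring described in steps 6 and 7 of the play and record
examples can be sketched in C as a small decision function. This is an
illustration only: the structure acb_out_sketch is a hypothetical
snapshot of four ACB output fields (its field widths and ordering are
assumptions, not the real ACB layout), TIMELEFT is assumed to be in
milliseconds, and the names next_action and poll_action are invented
for the example. The STATE values and the 20h CONTROLS bit are taken
from the ACB field descriptions.

```c
#include <stdint.h>

/* STATE values as listed for the ACB output field STATE. */
enum { ST_STOPPED = 0x00, ST_PLAYING = 0x01,
       ST_RECORDING = 0x02, ST_STOPPING = 0x03 };

/* CONTROLS bit from the ACB description: refresh the I/O buffer. */
#define CTL_REFRESH_IO 0x20u

/* Hypothetical snapshot of the ACB output fields an application polls.
   Widths and ordering are assumptions, not the real ACB layout. */
typedef struct {
    uint32_t position;  /* POSITION: current position, milliseconds      */
    uint16_t state;     /* STATE: one of the ST_* values                 */
    uint16_t backrc;    /* BACKRC: nonzero after a background error      */
    uint32_t timeleft;  /* TIMELEFT: assumed here to be in milliseconds  */
} acb_out_sketch;

typedef enum { ACT_WAIT, ACT_REFRESH, ACT_FINISHED, ACT_ERROR } poll_action;

/* Decide what the application's polling loop should do next. */
poll_action next_action(const acb_out_sketch *o, uint32_t refill_margin_ms)
{
    if (o->backrc != 0)
        return ACT_ERROR;      /* the AAPI stopped the operation          */
    if (o->state == ST_STOPPED)
        return ACT_FINISHED;   /* ready for AUD_SET or AUD_TERM           */
    if (o->state == ST_STOPPING)
        return ACT_WAIT;       /* let the pending stop complete           */
    if (o->timeleft <= refill_margin_ms)
        return ACT_REFRESH;    /* set CTL_REFRESH_IO in CONTROLS and call
                                  AUD_CTRL before asynchronous I/O starts */
    return ACT_WAIT;
}
```

An application would call such a function each time it samples the ACB
and issue AUD_CTRL only when ACT_REFRESH is returned; the margin value
is a tuning choice left to the caller.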
5.4 PLAY (TWO CHANNEL)
______________________

Method 1 - Use the single channel example above for both channels. The
caller must do a separate call for each channel for all requests.

Method 2 - Pass a secondary ACB

| 1. Determine type of audio file for each channel (FAB_TYPE).

2. Allocate a FAB and open an audio file (FAB_OPEN, FAB_ROPN) for each
   channel.

3. Allocate an I/O buffer and an ACB for each channel.

4. Call AUD_INIT passing the ACB pointer and the second ACB pointer in
   ACB2PTR. Both ACBs will be initialized.

5. Set the appropriate parameters in both ACBs and call AUD_SET.

6. Call AUD_STRT when ready to start playing. Monitor the following
   output fields in the ACB(s) which will be updated at 0.1 second
   intervals:

   ∙ POSITION - Current position in audio.
   ∙ TIMELEFT - Time left before AAPI will begin doing asynchronous I/O
     to refill the audio buffers.

7. Call AUD_CTRL* as many times as necessary to change audio controls
   and/or to request that I/O be performed to refill the audio buffers.

8. Call AUD_SET to continue or start a new operation or AUD_TERM to
   terminate audio processing. If AUD_TERM is called all audio objects
   that were not already open before the AUD_INIT call will be closed.

* Note: Audio control requests can be made on all AAPI calls except
  AUD_INIT and AUD_TERM by setting the "CONTROLS" parameter in the ACB.


6. AVC AUDIO FILE FORMAT
________________________

6.1 AUDIO FILE OVERVIEW
_______________________

The following section provides an overview of the AVC file structure
that is used by the AAPI. Detailed descriptions of the data structures
that make up these files are presented in 6.2, "File Data Structures" on
page 55.

6.1.1 AUDIO FILE TYPES

There are two types of files associated with a piece of AVC audio: an
"AUDIO" file and an "ESCAPE" file. Together, these files make up a
single entity that can be created, opened, and manipulated.
This document, when discussing an "audio file", will always mean the
"AUDIO" file. When referring to the "ESCAPE" file, "escape file" will
always be used.

6.1.1.1 "AUDIO" File

The "AUDIO" file contains general information about the file itself and
sub-components called "objects". The file contains a header and a
variable length internal directory locating the objects. The header
contains information necessary to verify that it is an audio file, read
the internal directory, and recreate external directories to the objects
within the file. Each directory entry identifies an object, its size,
and its location within the file.

6.1.1.2 "ESCAPE" FILE

The "ESCAPE" file contains digitized audio data. All information about
the data and how to access it is contained in objects in the audio file.

6.1.2 AUDIO OBJECT TYPES

Each object has a distinct type code. The type code identifies its
particular format. An object can also be further distinguished by a
sub-type field which indicates further differences as to usages by an
application. The sub-type field also distinguishes between multiple
objects of the same type in the same file.

Each object consists of a header and data section. The header has the
same data structure for all objects of the same type and provides
sufficient information to process the data section.

An audio file does not necessarily contain all objects; only the
"Audio", "Escape", and "Volume" objects are required. The objects may
appear in any order in the file.

There are five types of audio objects. Figure 1 on page 54 shows the
structure and relationships of the audio files and objects. A brief
description of each object type and how it is used by the AAPI is given
below.

"AUDIO" Object

The "AUDIO" object is a standard object that contains general
information about, and an index to, a collection of digitized audio
data.
The "unique to object" section includes the following information about
the audio data:

∙ Length of data (milliseconds and bytes)
∙ Encoding scheme
∙ Indexing scheme

The "variable section" includes the following information about the
audio data:

∙ Array of indexes into audio data. Each index points to an audio
  segment. An audio segment is an addressable unit of audio.

The AAPI uses this object to locate and read audio data for playing, and
updates this object during record.

"AUDVOL" Object

The "AUDVOL" object is a standard object that contains volume
information. This volume information corresponds to the indexed segment
information in the "AUDIO" object. Each indexed segment has a
corresponding volume entry.

The "unique to object" section includes the following information about
the audio data:

∙ Number of volume data entries.

The "variable section" includes the following information about the
audio data:

∙ Volume information about one segment of audio data

The AAPI updates this object during record operations.

"ESCAPE" Object

The "ESCAPE" object is a special type of object that has normal header
sections but no variable section. The "ESCAPE" object points to a
separate "escape" file. The contents of this file are the digitized
audio data.

The "unique to object" section includes the following information about
the audio data:

∙ "Escape" file access information

The "variable section" includes the following information:

∙ None

The AAPI reads this data during play, and updates this object during
record.

┌────────────────────────────────────────────────────────────┐
│                                                            │
│  AUDIO File                                                │
│                                                            │
│      "AUDIO" Object                                        │
│          General information about audio data              │
│          Index to individual audio segments                │
│          (one index entry for each segment)                │
│                                                            │
│      "AUDVOL" Object                                       │
│          Volume data for audio segments                    │
│          (one volume entry for each audio segment)         │
│                                                            │
│      "ESCAPE" Object                                       │
│          Information about separate "escape" file          │
│                                                            │
│      "AUDPNTS" Object                                      │
│          Application program information                   │
│                                                            │
│      "AUDLABL" Object                                      │
│          Application program information                   │
│                                                            │
│  End of AUDIO File                                         │
│                                                            │
│  ESCAPE File                                               │
│                                                            │
│      Digital audio data segments                           │
│                                                            │
│  End of ESCAPE File                                        │
│                                                            │
└────────────────────────────────────────────────────────────┘

Figure 1. Audio Files and Objects

"AUDPNTS" Object

The "AUDPNTS" object is a standard object that contains application
information about the audio data.

The "unique to object" section includes the following information about
specific points in the audio.

∙ Number of points listed in variable section

The "variable section" includes the following information about the
audio data:

∙ Point position (milliseconds)
∙ Application information

The AAPI does not use this object but will allocate and initialize it if
requested.

"AUDLABL" Object

The "AUDLABL" object is a standard object that contains a subset of the
information in the "AUDPNTS" object.

The "unique to object" section includes the following information about
specific points in the audio.

∙ Number of points listed in variable section

The "variable section" includes the following information about the
audio data:

∙ Point position (milliseconds)
∙ Application information

The AAPI does not use this object but will allocate and initialize it if
requested.


6.2 FILE DATA STRUCTURES
________________________

The following section provides detailed information on the data
structures needed to create and manipulate audio files both on disk and
in memory. The AAPI provides functions to perform high level file
processing using these structures, but the caller must still have an
understanding of the structures to do more complicated operations. The
caller need not use the AAPI file functions at all, replacing them with
his own, as long as the file data is kept in the correct format when
presented to the AAPI.

6.2.1 DIRECTORY CONTROL BLOCK

This is a representation of the organization of an audio file on disk.
It consists of a header and an array of directory entries, each entry
representing an object in the file. The overall directory control block
is:

    Directory Control Block
        Directory header
        First directory entry
        Next directory entry
          .
          .
        Last directory entry
    End of Directory Control Block

; ** Directory Control Block Header **
; _________________________
dcbh      struc
;
sig       db 8 dup     ; Signature and null terminator
vers      dw           ; Version/Modification
ftype     dw           ; File type
eod       dd           ; Total bytes of data
reserve1  dd           ; Reserved - compatibility
ents      dw           ; Number of directory entries
activ     dw           ; Number of active directory entries
reserve2  db 12 dup    ; Reserved - compatibility
name      db 64 dup    ; ASCII-z reference code
reserve3  db 36 dup    ; Reserved - compatibility
reserve4  db 21 dup    ; Reserved - expansion
reserve5  db 3 dup     ; Reserved - compatibility
dcbh      ends

The fields in the Directory Control Block Header are defined below. Any
field where specific values are not given should be set to zero for
compatibility with other products.
Likewise, any reserved fields should be set to zero.

1.  SIG - A standard literal string identifying the file for a
    particular product. The signature for the AVC product is "+A+V+C+".

2.  VERS - Indicates the level of the file, permitting recognition and
    processing of down-level files.

3.  FTYPE - A code to characterize the file in general terms. Audio
    files should be set to 0x0500.

4.  EOD - Total bytes is the data area spanned by the file from the
    beginning of the directory to the end of the physically last object.
    The offset following the end of this last object is available for
    writes of new or updated objects.

5.  RESERVE1 - Reserved for product compatibility.

6.  ENTS - Number of directory entries is a count enabling functions
    opening a file to allocate the proper amount of memory to hold the
    entire directory. Null directory entries, if any, are included in
    this count.

7.  ACTIV - Number of active directory entries is a count of the
    non-null entries. Active entries physically appear first in the
    directory.

8.  RESERVE2 - Reserved for product compatibility.

9.  NAME - Fully-qualified file name.

10. RESERVE3 - Reserved for product compatibility.

11. RESERVE4 - A reserved area for future expansion.

12. RESERVE5 - Reserved for product compatibility.

; ** Directory Control Block Entry **
; _________________________
dcbe      struc
;
type      dw           ; Object type code
subtype   dw           ; Object subtype code
reserve1  db 2 dup     ; Reserved - compatibility
sizh      dw           ; Size of object header in file
sizv      dd           ; Size of data section in file
size      dd           ; Size of object in file
reserve2  db 4 dup     ; Reserved - compatibility
offset    dd           ; Object's offset in file
reserve3  db 8 dup     ; Reserved - expansion
dcbe      ends

The fields in the Directory Control Block Entry are defined below. Any
field where specific values are not given should be set to zero for
compatibility with other products. Likewise, any reserved fields should
be set to zero.

1.
TYPE - Classifies the object and indicates the layout of the object's
   header. The object type and sub-type codes for a null entry are 0000.
   See the FAB type field for valid audio object types.

2. SUBTYPE - See the FAB subtype field for valid audio object subtypes.

3. RESERVE1 - A reserved area for product compatibility.

4. SIZH - Number of data bytes in the file for the header section.

5. SIZV - Number of data bytes in the file for the data section.

6. SIZE - Total number of bytes for both the header and data section.

7. RESERVE2 - A reserved area for product compatibility.

8. OFFSET - The object's offset from the start of the file.

9. RESERVE3 - A reserved area for future expansion.

6.2.2 FILE ACCESS BLOCK (FAB)

This is the representation of an audio file in memory. It has a header
and, immediately following, a list of file objects, referred to as the
FABO list. Unlike the directory, however, the FABO list does not
necessarily identify all objects in the file; it only contains those of
interest to the application. The overall file access block is:

    File Access Block
        File Access Block Header
        FABO - first object
        FABO - next object
          .
          .
        FABO - last object
    End of File Access Block

; ** File Access Block Header **
; _________________________
fabh      struc
;
fabbit    dw           ; FAB status flag
fabcnt    dw           ; Number of FABO's in FABO list
fabhdl    dw           ; File handle
fabver    dw           ; Version/modification number
reserve1  db 2 dup     ; Reserved - compatibility
fabtyp    dw           ; File type
reserve2  db 10 dup    ; Reserved - compatibility
fabnam    db 64 dup    ; ASCII-z file name
reserve3  db 8 dup     ; Reserved - expansion
fabstat   dw           ; FAB commit flag
fabh      ends

The fields in the File Access Block Header are defined below. Any field
where specific values are not given should be set to zero for
compatibility with other products. Likewise, any reserved fields should
be set to zero.

The fields in the File Access Block Header can have the following
values:

1.
FABBIT - Status flags - None currently defined

2.  FABCNT - Number of FABO's is a count of elements in the FABO array.

3.  FABHDL - Operating system file handle for audio file.

4.  FABVER - Equivalent to the directory version number.

5.  RESERVE1 - A reserved area for product compatibility.

6.  FABTYP - The following types are valid:

    ∙ Audio File - 0x0500

7.  RESERVE2 - A reserved area for product compatibility.

8.  FABNAM - ASCII-z reference string (name of file)

9.  RESERVE3 - A reserved area for future expansion.

10. FABSTAT - 0x0080 - FAB has been committed

; ** File Access Block Object **
; _________________________
fabo      struc
;
fostat    dw           ; Object status flags
                       ;   0x0001 = Data section modified
                       ;   0x0002 = Data section in memory
                       ;   0x0010 = Header section modified
                       ;   0x0020 = Header section is in memory
                       ;   0x0080 = Object is on disk
                       ;   0x0100 = Data section is allocated
                       ;   0x0200 = Header section is allocated
                       ;   0x2000 = Escape file open
fotype    dw           ; Object type code
fosub     dw           ; Object subtype
fosizh    dw           ; Size of header section
fosizv    dd           ; Size of data section
reserve1  db 5 dup     ; Reserved - compatibility
fohdptr   dd           ; Pointer to header section
fabo      ends

The fields in the File Access Block Object are defined below. Any field
where specific values are not given should be set to zero for
compatibility with other products. Likewise, any reserved fields should
be set to zero.

1. FOSTAT - Process and error flags. See above for bit values.

2. FOTYPE - The following types are valid:

   ∙ Audio  - 0x0500
   ∙ Volume - 0x0501
   ∙ Points - 0x0502
   ∙ Labels - 0x0503
   ∙ Escape - 0x8500

3. FOSUB - Object subtype is the same as the directory entry.

4. FOSIZH - Size of the header in memory is always 16 bytes larger than
   the size on disk, since the memory header contains memory location
   information about the data piece of the object.

5. FOSIZV - Size of the data section.

6. RESERVE1 - A reserved area for product compatibility.

7.
FOHDPTR - A pointer to the header section. The header in memory in turn
   contains a pointer to the data section.

6.2.3 AUDIO OBJECTS

Each object consists of a header and data section. All objects of a
particular type have the same header size and layout. Among objects of
the same type, data sections tend to differ from one another in size,
though not in general structure.

An object's header has three parts, the first two of which are common to
all objects, no matter what the object type. The first is a set of
run-time fields with memory location information for the object's data
section; these fields exist only in the memory representation of the
object and are not saved on disk. The second part is a standard prologue
whose principal use is to assist in finding the object in memory or file
dumps. The rest of the header is data unique to the particular object
type. The overall layout of an object header is:

    Object Header
        Common memory section
        Common prologue
        Unique to object section
    End of Object Header

  ; ** Object Header Common Memory **
  ; _________________________
  obmem     struc
  ;
  dataptr   dd           ; Pointer to start of data section
| reserve1  db 26 dup    ; Reserved - expansion
| mem_id    dw           ; Memory type
                         ;   0x00 = Not allocated
                         ;   0x01 = Main memory
                         ;   0x02 = LIM memory
                         ;   0x03 = Device
  obmem     ends

The fields in the Object Header Common Memory section are defined below.
Any field where specific values are not given should be set to zero for
compatibility with other products. Likewise, any reserved fields should
be set to zero.

1. DATAPTR - The address of the data section of the object.

2. RESERVE1 - A reserved area for expansion

3. MEM_ID - Type of memory for data section. See above for types.

; ** Object Header Common Prologue **
; _________________________
obmem     struc
;
obj_ids   db 8 dup     ; Visual ID and null terminator
obj_ver   dw           ; Version
cmp_typ   dw           ; Compression type
reserve1  db 4 dup     ; Reserved
obmem     ends

The fields in the Object Header Common Prologue are defined below. Any
field where specific values are not given should be set to zero for
compatibility with other products. Likewise, any reserved fields should
be set to zero.

1. OBJ_ID - A seven character literal string that identifies a
   particular object type. The following IDs are valid:

   ∙ Audio Object  - "AUDIO  "
   ∙ Volume Object - "AUDVOL "
   ∙ Points Object - "AUDPNTS"
   ∙ Labels Object - "AUDLABL"
   ∙ Escape Object - "ESCAPE "

2. OBJ_VER - Indicates the level of the object, permitting functions to
   recognize and possibly process down-level formats.

3. CMP_TYPE - A code applicable to objects whose data section is stored
   in compressed form in the file. It indicates a particular compression
   algorithm.

4. RESERVE1 - A reserved area for expansion.

6.2.3.1 Object Descriptions

The unique header fields and data sections are different for each object
and are described below.

AUDIO OBJECT DESCRIPTION: The audio header occupies 64 bytes in a file
and 80 in memory.

; ** Audio Object Header Structure **
; _________________________
audobjh   struc
;
com_mem   db 32 dup    ; Common memory structure
com_pro   db 16 dup    ; Common prologue structure
aud_time  dd           ; Length of audio in milliseconds
aud_end   dd           ; Offset to end of segment data
aud_segm  dw           ; Length of segment in milliseconds
aud_segb  dw           ; Length of segment in bytes
aud_dion  dw           ; # of segment index entries
aud_dil   dw           ; Length of a segment index entry (bytes)
aud_blks  dw           ; Total number of physical segments
aud_grb   dw           ; Total number of garbage segments
aud_fg1   db           ; Format flag
                       ;   0x80 = Fragmented
                       ;   0x40 = Sequential
                       ;   0x20 = Silence segment at start of
                       ;          escape file
aud_fg2   db           ; Coding type
                       ;   0x80 = Fixed rate coding
                       ;   0x40 = Variable rate coding
                       ;   0x20 = Variable segment size
                       ;   0x10 = Variable segment time
aud_comp  dw           ; Compression method
                       ;   0x00 = Default - ADPCM/11K
                       ;   0x01 = ADPCM 11.0K (mono)
                       ;   0x02 = ADPCM 5.5K (mono)
                       ;   0x03 = ADPCM 22.0K (stereo)
                       ;   0x04 = ADPCM 22.0K (mono)
                       ;   0x64 = AVC MIDI
aud_mcid  dw           ; Microcode ID
                       ;   0x00 = Pre-release version
                       ;   0x01 = Post-release (V1.02 and up)
reserve1  db 22 dup    ; Reserved
audobjh   ends

An audio object's data section is a simple continuous array of 3-byte
offsets to the digital audio, based on the premise of fixed rate coding.
The index entry size allows for 24 minutes of audio when digitizing at
11k bytes per second.

; ** Audio Object Data Structure **
; _________________________
audobjd   struc
;
aud_dof   db 3 dup     ; Offset to digital data segment
audobjd   ends

VOLUME OBJECT DESCRIPTION: The volume header occupies 32 bytes in a file
and 48 in memory.

; ** Volume Object Header Structure **
; _________________________
volobjh   struc
;
com_mem   db 32 dup    ; Common memory structure
com_pro   db 16 dup    ; Common prologue structure
aud_voln  dw           ; # of volume data entries
reserve1  db 14 dup    ; Reserved for expansion
volobjh   ends

A volume object's data section is a simple array of 1-byte volume codes.
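
As a hedged illustration of how an application might unpack one of
these volume codes, the C sketch below relies on the bit assignments
given in the volume data structure listing for this object: one nibble
per track, bits 0-2 carrying the level code and bit 3 flagging clipping.
It additionally assumes that "left nibble" means the high-order four
bits of the byte. The type track_volume and the function names are
hypothetical and are not part of the AAPI.

```c
#include <stdint.h>

/* Hypothetical decoded form of one track's 4-bit volume entry. */
typedef struct {
    unsigned level;    /* bits 0-2: level code, 0 (no signal) to 7 */
    int      clipped;  /* bit 3: nonzero when clipping occurred    */
} track_volume;

/* Split a 4-bit entry into its level code and clipping flag. */
static track_volume decode_nibble(uint8_t nibble)
{
    track_volume v;
    v.level   = nibble & 0x07u;
    v.clipped = (nibble & 0x08u) != 0;
    return v;
}

/* Right nibble: right track in stereo, the only valid entry in mono. */
track_volume volume_right(uint8_t code)
{
    return decode_nibble((uint8_t)(code & 0x0Fu));
}

/* Left nibble (assumed high-order bits): left track, stereo only. */
track_volume volume_left(uint8_t code)
{
    return decode_nibble((uint8_t)(code >> 4));
}
```

For mono data only volume_right is meaningful, since the left nibble is
documented as undefined in that case.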
; ** Volume Object Data Structure **
; _________________________
volobjd   struc
;
aud_vol   db           ; Volume data
                       ;   Stereo - Left Nibble  = Left track
                       ;            Right Nibble = Right track
                       ;   Mono   - Right Nibble = Valid
                       ;            Left Nibble  = Undefined
                       ;
                       ;   Sum of
                       ;   bits 0,1,2 / Input RMS Voltage
                       ;       0      /   0
                       ;      1-3     /  .001 - .06
                       ;       4      /  .06  - .12
                       ;       5      /  .12  - .24
                       ;       6      /  .24  - .47
                       ;       7      /  .47  - 2.4
                       ;   Bit 3
                       ;       On = Clipping occurred
volobjd   ends

POINTS OBJECT DESCRIPTION: The points header occupies 32 bytes in a file
and 48 in memory.

; ** Points Object Header Structure **
; _________________________
pntobjh   struc
;
com_mem   db 32 dup    ; Common memory structure
com_pro   db 16 dup    ; Common prologue structure
aud_plon  dw           ; Number of point list entries
reserve1  db 14 dup    ; Reserved for expansion
pntobjh   ends

A point object's data section is an array of 76-byte audio points.

; ** Audio Points Data Structure **
; _________________________
pntobjd   struc
;
aud_ptm   dd           ; Point position in time (milliseconds)
aud_plb   db 6 dup     ; Label for point - ASCII-z
aud_pcm   db 41 dup    ; User annotation of point - ASCII-z
aud_ppc   db 8 dup     ; Mix program - command - ASCII-z
aud_ppl   db 5 dup     ; Mix program - level - ASCII-z
aud_ppt   db 5 dup     ; Mix program - time - ASCII-z
aud_pfl   db           ; Point flag
reserve1  db 6 dup     ; Reserved for expansion
pntobjd   ends

LABELS OBJECT DESCRIPTION: The labels header occupies 32 bytes in a file
and 48 in memory.
; ** Labels Object Header Structure ** ; _________________________ labobjh struc ; com_mem db 32 dup ; Common memory structure com_pro db 16 dup ; Common prologue structure aud_lbln dw ; Number of labels entries reserve1 db 14 dup ; Reserved for expansion labobjh ends A point object's data section is an array of 12-byte audio labels ; ** Audio Points Data Structure ** ; _________________________ pntobjd struc ; aud_ltm dd ; Point position in time (milliseconds) aud_lbl db 6 dup ; Label for point - ASCII-z reserve1 db 2 dup ; Reserved for expansion ESCAPE FILE REFERENCE OBJECT DESCRIPTION: An escape file contains data which is too big and unwieldy to be handled within a normal data section of an object. The escape file is associated with an audio file, and the audio file contains an "escape file reference object" to describe the sep- arate escape file. An escape file reference object has only a header; there is no data section (this is the data in the escape file). The header occupies 112 bytes in the file and 128 in memory. 6. 

; ** Escape File Reference Object Header Structure **
; _________________________
escobjh   struc            ;
com_mem   db   32 dup      ; Common memory structure
com_pro   db   16 dup      ; Common prologue structure
reserve1  db    2 dup      ; Reserved - compatibility
esc_sig   dw               ; Signature flag
                           ;   0x00 = Standard signature
                           ;   0x01 = No signature
reserve2  db    8 dup      ; Reserved - compatibility
esc_ref   db   64 dup      ; ASCII-z name of escape file
esc_hdl   dw               ; Escape file handle
esc_thl   dw               ; Temporary escape file handle
reserve3  db   16 dup      ; Reserved for expansion
escobjh   ends

The first 32 bytes of the escape file are a file signature with the
following layout:

; ** Escape File Signature Structure **
; _________________________
escsgnh   struc            ;
sgn_sgn   db    8 dup      ; ASCII-z signature
                           ;   AVC = "+A+V+C+"
sgn_ver   dw               ; Version modification
sgn_type  dw               ; File type
                           ;   0x8000 = Escape file
reserve1  db    8 dup      ; Reserved - compatibility
reserved  db   12 dup      ; Reserved - expansion
escsgnh   ends


7.  RIFF WAVE AUDIO FILE FORMAT
_______________________________

7.1  AUDIO FILE OVERVIEW
________________________

The AAPI supports the Microsoft Resource Interchange File Format (RIFF)
Waveform Audio File Format (WAVE).  Users should refer to the Microsoft
Multimedia Development Kit for details of this format.

If the AAPI file level functions (FAB_TYPE, FAB_ROPN, FAB_SAVE,
FAB_CLOSE) are used then detailed knowledge of the RIFF WAVE format
should not be necessary.  Once RIFF WAVE files have been read into
memory they are represented using the AVC file format structures so
that the AAPI audio control functions can work transparently on both
types of files.


APPENDIX A.
OS/2 CONSIDERATIONS
________________________________

A.1  OS/2 USAGE
________________

The OS/2 version of the AAPI is shipped as a dynamic link library (DLL)
and an accompanying import library.  The user statically links the
application with the import library and then places the DLL somewhere
in the operating system's DLL path.

All calls to the DLL must be far calls and all pointers passed must be
far data pointers.

A.2  OS/2 DEVICE DRIVER
________________________

When running an application using the AAPI under OS/2 the following
statements must be added to the CONFIG.SYS file:

   DEVICE=\Your_Device_Path\M-ACPA_OS/2_Device_Driver_Name
   IOPL=YES

The device driver must be installed even if the DOS version of the AAPI
is being used in the DOS Compatibility box.  The compatibility device
driver for AVC must be installed if the AAPI application is going to be
used in OS/2, or if any application other than an AAPI application is
going to access the M-ACPA card.  If only AAPI applications are going
to be run in the DOS box then the minimal device driver may be used.
See the M-ACPA Device Driver document for more details.


APPENDIX B.  DOS CONSIDERATIONS
_______________________________

B.1  DOS USAGE
_______________

In DOS the AAPI is delivered as a large model C library that should be
statically linked with the application.  The AAPI routines were
compiled using the Microsoft C 6.00 Compiler.

B.2  DOS DEVICE DRIVER
_______________________

When running an application using the AAPI under DOS the following
statement must be added to the CONFIG.SYS file:

   DEVICE = \Your_Device_path\M-ACPA_DOS_Device_Driver_Name

The minimal or full function M-ACPA driver may be used.  If no other
audio applications are using the M-ACPA then use the minimal driver.
If applications other than AAPI applications may also be using the
M-ACPA then use the full function device driver.  See the M-ACPA Device
Driver documentation for more details.

B.3  DOS REENTRANCY
____________________

Using the Audio API under DOS will result in the Audio API interrupt
handler being invoked at random times.  Since DOS is not reentrant, the
interrupt handler checks the DOS critical section flag before using any
DOS services (such as for disk I/O).  The AAPI caller should not
structure the application program in a way that leaves this critical
section flag on for long periods of time.  This will prevent the
interrupt handler from using DOS services.  For example, using DOS
calls in a very tight loop to poll the user for input could cause this
situation.

B.4  EXPANDED MEMORY
_____________________

If expanded memory is being used, the LIM device driver must meet
certain performance requirements and be re-entrant for the AAPI to
perform adequately.  All device drivers we have tested have been
acceptable with the exception of the DOS 4.0 XMAEMS/XMAEM combination
on 386 machines.  XMAEMS on 286 machines does have acceptable
performance.


APPENDIX C.  ADDITIONAL INFORMATION ON PCM SUPPORT
__________________________________________________

C.1  SIGNIFICANCE OF THE DIFFERENT PCM MODES
____________________________________________

PCM or Pulse Code Modulation is a method of encoding sound information
as a series of numerical sound samples.  The current PCM support allows
the user to record and play data at a variety of sample widths and
sample rates.  The choice of which sample rate and width to use is
essentially an economic trade-off between storage size and sound
quality.

C.1.1  SAMPLE RATE

The choice of sample rate determines the highest frequency which can be
represented by the sampled sound data.
Consider the problem of sampling a relatively low frequency sound
signal:

   [Figure: Low Frequency Sound.  A slowly varying waveform with the
   sound samples marked at regular intervals.]

Clearly the low frequency sound pictured above can be represented
nicely if we take samples at the regular rate shown.  What about a
higher frequency sound?

   [Figure: High Frequency Sound.  A rapidly varying waveform with the
   sound samples marked at the same regular intervals.]

Clearly the higher frequency sound shown above cannot be represented by
samples taken at the rate shown.  The low points of the signal between
each sample would be missed.  That is, if you did sample the sound at
the rate shown, and then tried to reconstruct the signal from the
samples, what you would get would look like this:

   [Figure: High Frequency Sound as reconstructed from the sound
   samples.]

This does not look like the original signal at all.  The rule here is
that the theoretical highest frequency which can be represented by a
series of samples is the frequency equal to half of the sampling rate.
This frequency is called the Nyquist frequency.

As it turns out, however, the real application of PCM recording and
playback is slightly more restrictive even than this.  If you again
examine the figures above showing the result of attempting to sample a
frequency higher than the Nyquist frequency, you will notice that not
only is the reconstructed signal not the original, it is in fact a
different signal.  This effect is called "aliasing."  Essentially, as
one attempts to sample a frequency higher than the Nyquist frequency,
what one gets is an "image" which is a tone whose frequency is equal to
the sample rate minus the frequency of the original tone.
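
The Nyquist and aliasing rules just stated can be sketched in C.  This
is an illustration only and is not part of the AAPI: the function names
nyquist_hz and alias_hz are hypothetical, and the image formula is
applied only to tones between the Nyquist frequency and the sample
rate, which is the case described in the text.

```c
#include <assert.h>

/* Nyquist frequency in Hertz for a given sample rate.  Integer
 * division is used, so the half Hertz of an odd rate is dropped. */
long nyquist_hz(long sample_rate)
{
    assert(sample_rate > 0);
    return sample_rate / 2;
}

/* Frequency that a pure tone appears to have after sampling.
 * A tone at or below the Nyquist frequency is unchanged.  A tone
 * between the Nyquist frequency and the sample rate is recorded as
 * an image at (sample rate - tone frequency). */
long alias_hz(long sample_rate, long tone_hz)
{
    assert(tone_hz >= 0 && tone_hz <= sample_rate);
    if (tone_hz <= nyquist_hz(sample_rate))
        return tone_hz;
    return sample_rate - tone_hz;
}
```
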
For example:

-  You select a record sampling rate of 11025 samples per second.

-  The Nyquist frequency is 11025/2, or about 5512 Hertz.

-  You attempt to sample a tone of 7000 Hertz.

-  The tone recorded will "look like" a tone of 11025 - 7000 = 4025
   Hertz.

In order to prevent this sort of "aliasing," the PCM support microcode
must digitally filter out frequencies higher than the Nyquist frequency
for each sampling rate.  In an ideal world (of unlimited free digital
signal processing power) one could have a so-called "brick-wall" filter
which would leave frequencies below the Nyquist frequency completely
undisturbed and also absolutely and completely eliminate any frequency
above the Nyquist frequency.  Real digital filters, however, are not so
perfectly precise and must be designed with a certain amount of safety.
The approximate bandwidth of the current PCM support for the M-ACPA at
different sample rates is shown below:

   Sample Rate            Maximum Frequency
   -----------            -----------------
    8000 samples/sec           3000 Hertz
   11025 samples/sec           4450 Hertz
   22050 samples/sec           9200 Hertz
   44100 samples/sec          20000 Hertz

C.1.2  SAMPLE WIDTH

The sound quality is also greatly affected by the number of bits used
to store each sound sample.  The choices for storing sound samples are:

-  8-bit integer

-  16-bit integer

-  effective 14-bit integer (mu-Law) (see C.2, "Mu-Law and A-Law
   Companding")

-  effective 13-bit integer (A-Law) (see C.2, "Mu-Law and A-Law
   Companding")

The significant effect of the number of bits used is the amount of
background hiss perceived by the listener.  This hiss is actually the
audible effect of quantization error or the error introduced when the
continuously variable sound level is assigned to the nearest integer
for storage as a sample.

Consider the following portion of a sound signal which we would like to
sample:

   [Figure: a smooth signal segment plotted against the nearest integer
   sample values 34, 35, 36 and 37.]

In order to sample the signal we will have to determine at fixed
intervals which of the available integer sample values was nearest to
the signal:

   [Figure: the same signal with a horizontal grid line drawn at each
   of the integer sample values 34, 35, 36 and 37.]

The resulting signal ends up sort of "squared off" looking something
like this:

   [Figure: the same signal overlaid with its quantized version, a
   staircase that steps between the integer sample values 34, 35, 36
   and 37.]

However, the new "square" version of the signal is definitely not the
same as the old smooth version.  In fact, the new version is equivalent
to the sum of the original version of the signal and an error signal
which looks like this:

   [Figure: the error signal, varying irregularly between -1/2 and
   1/2 around 0.]

This error is in fact random and uniformly distributed on the interval
between -1/2 and 1/2.  If you consider the error to be a signal, it has
an RMS value of 1/(2 * square-root(3)) or approximately 0.3.

Now, if we use 8-bit numbers to represent the full scale of our sampled
sound, in the best case, a signal can range from -128 to +127.  If the
sound is a pure tone, or sine wave, it has an RMS value of
127/square-root(2) or approximately 90.  In this case, the volume of
the noise represents 0.3/90 or 1/3 percent of the volume of the signal.
Alternatively stated, the signal to noise ratio is

   20 * log10 (90 / 0.3) = 49.5 dB.

The signal is 300 times louder than the noise.

In the lucky case where you are recording only sounds that are all of
the same (loud) volume, this noise will not be a problem.
Unfortunately, most reasonably good music has a much wider range of
small and large sounds than this.
The sound of a symphony orchestra going full bore is on the order of
10,000 times louder (from a sound pressure level point of view) than
the sound of the lone piccolo playing.  Clearly if we set our sample
scale to be able to capture the full orchestra, the sound of the
piccolo will be lost in the quantization noise if we use an 8-bit
scale.

The solution to this problem is to use a 16-bit scale.  Now our loudest
signal can range from -32768 to 32767.  The error, however, still is on
the order of 0.3, just like before.  The full scale sine wave will now
have an RMS value of 32767/square-root(2) or approximately 23,170.  In
this case, the ratio of the loudest signal to the noise is 23,170/0.3
which is approximately equal to 77,200.  We now have more than enough
dynamic range to take care of the piccolo to full-orchestra span of
volume.  The theoretical maximum signal to noise ratio is

   20 * log10 (77,200) = 98 dB (approximately).

There are three things to note about this number:

1.  The actual performance of the M-ACPA is less than this.  However,
    experience here is that the noise performance of the M-ACPA is good
    enough so that most of the time other factors such as noise in the
    cables, power-line hum in the amplifier, and so on are sufficient
    to mask any noise introduced by the M-ACPA.

2.  Some less-than-scrupulous stereo manufacturers (almost all of them
    actually) manage to use convoluted assumptions like: "the customer
    will be listening to nothing but square waves" to skew this
    calculation to show even higher signal to noise numbers for their
    CD players.

3.  In point of practice, in any environment other than an anechoic
    chamber with several thousand dollars worth of the very best high
    fidelity studio audio equipment, anything better than about 50 dB
    of signal to noise ratio sounds wonderful to the average listener.
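
The signal to noise arithmetic in this section can be sketched in C.
This is an illustration only and is not part of the AAPI; the function
names are hypothetical.  The first function follows the derivation
above in terms of power: a full scale sine wave of peak value P has a
mean square value of P*P/2, and the quantization error has a mean
square value of 1/12 (the square of 1/(2 * square-root(3))), so the
power ratio is 6*P*P.  The second function uses the common rule of
thumb of about 6.02 dB per bit plus 1.76 dB, which assumes a peak value
of 2 to the power (bits - 1).  It gives about 49.9 dB for 8 bits and
about 98.1 dB for 16 bits; the text arrives at 49.5 dB and 98 dB
because it rounds the noise to 0.3.

```c
#include <assert.h>

/* Ratio of signal power to quantization noise power for a full scale
 * sine wave with the given peak sample value (127 for 8-bit samples,
 * 32767 for 16-bit samples).
 *   signal mean square = peak * peak / 2
 *   noise mean square  = 1 / 12
 * so the ratio is 6 * peak * peak. */
double quant_snr_power(long peak)
{
    assert(peak > 0);
    return 6.0 * (double)peak * (double)peak;
}

/* Approximate signal to noise ratio in dB for linear samples of the
 * given width, using the rule of thumb 6.02 * bits + 1.76. */
double quant_snr_db(int bits)
{
    assert(bits > 0);
    return 6.02 * bits + 1.76;
}
```

The square root of quant_snr_power(127) is about 311, which corresponds
to the "300 times louder" figure quoted for 8-bit samples.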

C.2  MU-LAW AND A-LAW COMPANDING
________________________________

The problem with using 16-bit integer samples rather than 8-bit integer
samples is that doing so doubles the amount of disk space required to
store the information.

The companies involved in designing digital telephone transmission
equipment ran up against this problem long ago and have developed a
solution which works remarkably well.  They use a non-linear scale to
code the sound samples into 8-bit values.

   [Figure: Mu-Law Companding Curve.  Horizontal axis: 14-bit signed
   sample value, 0 to 8159, with ticks at 2048, 4096, 6144 and 8159.
   Vertical axis: 8-bit value, 0 to 127, with 63 marked.]

The actual digital to analog converter uses either a 13 or 14 bit
signed integer scale.  The number obtained is then mapped to an 8-bit
number as shown for storage or transmission.

There are two mapping curves in common use in the telephony world for
this purpose:

Mu-Law   Maps a signed 14-bit number to an 8-bit number.  Mu-law is
         used in the telephone networks of the United States, Canada,
         Japan, Hong Kong, Taiwan, and Korea.

A-Law    Maps a signed 13-bit number to an 8-bit number.  A-law is used
         in the telephone networks of all countries other than the ones
         which use mu-Law.

Both of these companding schemes effectively achieve an apparently
higher signal to noise ratio by intentionally distorting louder
signals.  The assumption here is that most of what is being recorded is
smaller scale signals most of the time.  For example:

1.  With 8-bit linear encoding, the quantizing error has an RMS value
    of 0.3 regardless of whether the input signal has an RMS value of
    90 (loud) or an RMS value of 5 (rather soft).  For the soft signal,
    the hiss is 0.3/5 = 6 percent as loud as the signal which is very
    noticeable with good audio equipment.  That is, the small signal
    would have a signal to noise ratio of only 24 dB.

2.
    By comparison, with mu-law encoding, the largest possible signal
    will have an RMS value of 8159/square-root(2) which is
    approximately 5768.  (The mu-law scale does not go quite all the
    way to 8192.)  However, the steps at the upper end of the mu-law
    curve are in units of 256.  The RMS value of the quantization noise
    will therefore be 128/square-root(3) or about 74.  This will give a
    signal to noise ratio for the loud signal of about
    20 * log10(5768/74) = 38 dB.

3.  The comparable small signal is 5 / 128 * 8159 = 318 on the mu-law
    scale.  In this range, the mu-law companding curve has a step size
    of 16.  The RMS value of the quantization noise will therefore be
    8/square-root(3) or about 4.6.  This will give a signal to noise
    ratio for the soft signal of about 20 * log10(318/4.6) = 36 dB.

The net effect of mu-law companding is to evenly scale the quantization
noise so that the signal to noise ratio is always about 37 dB
regardless of how loud or soft the signal is.

Mu-Law and A-Law are always 8-bit and can be supported at any sample
rate, stereo or mono, except 44100 samples per second stereo.  Only a
single track of mu-Law or A-law PCM can be played back at one time.

The exact definitions of mu-Law and A-Law companding are available in
recommendation G.711 of the CCITT (the United Nations body which sets
international telephony agreements).

C.3  DITHER
___________

For applications which require recording of 8-bit linear data (and to a
lesser extent 14-bit mu-law and 13-bit A-law data) the use of dither
can significantly improve the quality of the recording of softer
signals.

The problem alleviated by dither is that of the extreme relative
distortion of signals whose magnitude is on the order of one bit.
Consider a pure tone whose amplitude is just around one bit.

   [Figure: a pure tone oscillating between sample levels 2 and 3.]

(The tone shown above has a peak to peak amplitude of 1 and a DC offset
of 2.5)

When quantized, this signal will tend to turn very square:

   [Figure: the same tone after quantization, a square wave alternating
   between sample levels 2 and 3.]

A square wave like this essentially represents the original tone with a
tremendous amount of harmonic distortion thrown into it.  The tone will
sound very rough and grating.  Further problems arise if the amplitude
of the tone is varying slightly over time.  If the amplitude of the
signal shown above were to decrease slightly, it would simply
disappear.  A short while later it might reappear again as the
amplitude once again became greater than one bit.  This phenomenon can
be heard easily with good audio equipment by making an 8-bit recording
(with dither disabled - 0 percent) of something like a French Horn solo
trailing delicately off to silence.  As the sound dies away, a very
unpleasant crackling noise will be heard.

Dither attacks this problem by inserting a small amount of random noise
into the signal before quantizing from 16 bits down to 8 bits (or down
to 14 or 13 bits in the case of mu-Law and A-Law).  The size of the
noise signal is user selectable to be scaled to a range of up to +/-
one bit.  The default range for dither is +/- 1/3 bit.  That is, in the
normal case, the M-ACPA microcode generates a series of random numbers
which have magnitudes between -1 and 1 and then multiplies them by (the
user specified value of) 33/100 and adds them to the signal before
quantizing to 8 bits.

This dither has the effect of randomizing the transitions between
quantization levels.  The signal shown above might look like this if
dithered with 1/3 bit of noise before quantization:

   [Figure: the same tone quantized after dithering, alternating
   between sample levels 2 and 3 at irregular intervals.]

Dither has two effects in this instance:

1.  The shape of the resulting wave is now no longer regular.  This
    irregular random shape tends to eliminate the harmonics of the
    fundamental tone present in the square wave.

2.  Because of the occasional addition of +1/3 bit to a high point on
    the signal or -1/3 bit to a low point on the signal, the signal
    will not die out completely until it falls below a peak to peak
    amplitude of 1/3 bit, considerably smaller than without dither.

What is not intuitively obvious but is borne out in theory and in
practice is that the original tone is still present in the randomized
and quantized signal shown above!  That is, if you average the
irregular dithered signal over time, the average will converge to the
original tone again.

The net effect is that by adding some additional background hiss, small
signals will die out smoothly.  If the French Horn solo recording test
described above is repeated with 50 percent dither, the sound of the
horn will fade gently into the background hiss without breaking up.

Dithering should be quite useful for those needing to produce 8-bit
linear recordings of as high a quality as possible.  Note, however,
that the actual level of the dither is really a matter of subjective
opinion.  There is no "right" answer.  The more dither you add, the
better the signal sounds, but the more background hiss you hear as
well.  Many textbooks suggest 33 percent as a good compromise and this
is the default value.

C.4  VOLUME, BALANCE, RAMP AND PAN
__________________________________

The M-ACPA supports the use of volume controls for both record and
playback.  In the record case, the volume controls affect the volume of
the data being recorded as well as the volume of the monitor.
In the playback case, there is a separate and independent set of volume
controls for each track.

C.4.1  MASTER VOLUME

The master volume is the overall volume setting for the track.

1.  Once recording or playback has begun, the master volume cannot be
    changed.

2.  The scale for master volume is 0 to 116.

    a.  The master volume scale is a logarithmic scale.

    b.  The default master volume is 100.

    c.  A value of 100 neither amplifies nor attenuates a signal.  A
        full scale signal coming into the M-ACPA input ports will be
        recorded as a full scale signal in the sound file.  A full
        scale signal in the sound file will be played as a full scale
        signal from the M-ACPA output ports.

    d.  An increase of 8 in the master volume setting doubles the
        voltage of the signal being output by the M-ACPA.  On record,
        an increase of 8 in the master volume setting doubles the
        numerical value of the samples being stored.

    e.  The maximum master volume setting of 116 increases the voltage
        of the signal being played by a factor of 4, which increases
        its power by a factor of 16.  The maximum of 116 represents an
        amplification of +12 dB.

    f.  A decrease of 8 in the master volume setting halves the voltage
        of the signal being output by the M-ACPA.  On record, a
        decrease of 8 in the master volume setting halves the numerical
        value of the samples being stored.

The ability to increase the volume is useful for some audio devices
which cannot supply a standard full scale audio signal to the M-ACPA
line in ports.  Note, however, that by amplifying the input signal by
12 dB, you are effectively losing 2 bits of resolution in your signal.
From the point of view of quantization noise, you are now dealing with
a 14-bit signal rather than a 16-bit signal.  This effect, however,
does not seem to be perceptible even with very high quality audio
equipment.

C.4.2  TRACK VOLUME

Track volume is used to modify the master volume.  The track volume can
be changed during record or playback.
The track volume ranges from 0 to 100 percent and indicates the
percentage of the current master volume to use.  Note that this is a
linear modification of the logarithmic master volume.  Setting master
volume to 100 and track volume to 50 percent will set the volume as
though you had set the master volume to 50.

C.4.3  RAMP RATE

A ramp (up or down) can be initiated by changing the current track
volume and specifying a number of milliseconds for the change to take
place.  For example:

1.  The master volume is set to 100.

2.  The track volume is set to 100.

3.  The track volume is changed to 50 and a ramp time of 5000
    milliseconds is specified.

4.  After 1 second the effective track volume will have changed to 90.

5.  After 2 seconds the effective track volume will have changed to 80.

6.  After 5 seconds the ramp will be complete and the new effective
    track volume level will be 50.

The ramp will sound very smooth and continuous to a human ear even
though the actual voltage levels are changing in a logarithmic fashion;
such is the nature of the human ear.

It is possible to interrupt a ramp in progress and set a new target
level to ramp to.  In this case, the ramp will be between the last
effective level (when the ramp in progress was interrupted by a new
change in track volume) and the new track volume.  The new ramp will
take the number of milliseconds specified at the time the new ramp is
initiated.

C.4.4  BALANCE

The balance control can be thought of as "percentage of the way to the
right."  The default of 50 percent means that the volume is split
evenly between right and left.  100 percent means that the volume is
all the way to the right; 0 percent indicates all the way to the left.

The basic operation of balance is to modify track volume which in turn
modifies master volume.  The actual operation of the balance, however,
is not so simple.  The major complication arises from the human
perception of sound levels.
Whereas the human perceives absolute sound levels in a logarithmic
fashion, unfortunately the human perceives relative balance of sound
levels (left to right) in a linear fashion!  If we are ever to hope to
make a pan (side to side) operate smoothly like a ramp (up and down)
this difference in human perception forces us to use a non-linear
balance scale to compensate for the logarithmic master volume scale.
The effect of balance settings on volume looks like this:

   [Figure: effect of balance on volume.  Vertical axis: volume, 0 to
   116, with 100 marked.  Horizontal axis: Percent Right, 0 to 100,
   with 50 marked.  Two curves, L (left) and R (right).]

At the default of 50 percent, balance has no effect on volume.  As the
volume shifts to one side, it increases the volume on that side above
100 percent while decreasing the volume on the other side.

Note that the maximum is still 116.  If the master volume is set to 116
and the balance is moved to one side, the balance curve will limit
itself as it gets to 116.  This effect will not impair the sense of
balance, but it will degrade the smoothness and evenness of a panning
action.

C.4.5  PAN RATE

Pan rate is to balance what ramp rate is to track volume.  A pan (left
or right) can be initiated by changing the current balance and
specifying a number of milliseconds for the change to take place.  For
example:

1.  The master volume is set to 100.

2.  The track volume is set to 100.

3.  The balance is set to 50.

4.  The balance is changed to 0 and a pan time of 5000 milliseconds is
    specified.

5.  Balance will pan smoothly to the left over 5 seconds.

The pan will sound reasonably smooth and continuous to a human ear
unless the application has also simultaneously requested the
amplification of the signal (master volume > 100).

C.5  SOURCE MIXING
__________________

Source mixing can be used to add the input from either of the M-ACPA
input ports into the output of PCM being played back.

1.
    Source mix can either be played on its own track or together with a
    PCM file on a track.

2.  Since stereo playback requires single track operation, source mix
    must be played on the same track as the PCM to mix over a stereo
    track.

3.  Playing source mix on its own track with another track of mono PCM
    allows each to have independent volume, balance, ramp and fade
    controls.

4.  The options for input for source mix are microphone, low gain
    microphone, or stereo line in (both).  Input from only stereo line
    left or only stereo line right is not supported.

5.  If stereo line both is used as an input and the PCM file being
    played is stereo, the source mix will be stereo as well.  That is,
    the left channel of the source will be mixed with the left channel
    of the PCM data and the right channel of the source will be mixed
    with the right channel of the PCM data.

    All other combinations of input and PCM data will cause source mix
    to be monophonic.
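
The source mix rules listed above can be summarized as a small C
sketch.  This is an illustration only and is not part of the AAPI: the
type mix_source_t, its enumerators and the three function names are
hypothetical and do not correspond to any AAPI constant or call.

```c
#include <assert.h>

/* Hypothetical identifiers for the M-ACPA input selections named in
 * the text.  The last two exist only so that the unsupported cases
 * can be expressed. */
typedef enum {
    SRC_MICROPHONE,
    SRC_LOW_GAIN_MICROPHONE,
    SRC_LINE_BOTH,
    SRC_LINE_LEFT,
    SRC_LINE_RIGHT
} mix_source_t;

/* Rule 4: only microphone, low gain microphone and stereo line in
 * (both) may be used as the input for source mix. */
int source_mix_input_supported(mix_source_t src)
{
    return src == SRC_MICROPHONE ||
           src == SRC_LOW_GAIN_MICROPHONE ||
           src == SRC_LINE_BOTH;
}

/* Rule 2: stereo playback requires single track operation, so source
 * mix must share the track with a stereo PCM file. */
int source_mix_needs_shared_track(int pcm_is_stereo)
{
    return pcm_is_stereo != 0;
}

/* Rule 5: source mix is stereo only when the input is stereo line
 * both and the PCM file is stereo.  Every other combination is
 * monophonic. */
int source_mix_is_stereo(mix_source_t src, int pcm_is_stereo)
{
    assert(source_mix_input_supported(src));
    return src == SRC_LINE_BOTH && pcm_is_stereo != 0;
}
```
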