Troubleshooting and Documentation

Content by Michael Lang (original archived HERE). Edited by Major Tom.
Last content update: 09 Dec 2000


This page provides background information to help you solve problems with your IBM MCA SCSI-subsystem, the driver and Linux. It gives examples and describes the command-line parameters.

No Reason for Alarm

Not everything that looks like a problem or a severe error at first glance is a real error. Several known issues may appear that are not related to the driver or MCA-Linux.
  • When it boots, it hangs after the line IBM MCA SCSI: Scanning SCSI-devices. or, from version 3.2a onward, ... cleared.
    If the driver hangs here for some time during boot or while being loaded as a module, this is completely normal. Depending on your adapter type, the number of SCSI-devices, and whether multiple LUN probing is selected in the kernel configuration, it can take up to 2 minutes until the whole PUN/LUN space of a SCSI-bus has been scanned for valid devices. If a SCSI-device is not found at the first attempt, the adapter has to wait for a certain timeout period, which the ANSI specifications grant to all SCSI-devices. If nothing answers by the end of that timeout period, no device is present, and the driver marks this in its internal PUN/LUN map.
  • It hangs forever during boot.
    This can happen if some of your SCSI-devices are faulty. Check for this by booting another operating system, such as DOS or OS/2. If that fails to boot too, it is obviously not a software problem. In other cases, please refer to the next section.

Debugging or Non-Debugging Mode

When Linux boots, the driver can show up in two different ways. If debugging inside the driver code is activated, you get quite verbose information about the driver's activities during initialization. To get rid of the verbose output, change the line

#define IM_DEBUG_PROBE
into
#undef  IM_DEBUG_PROBE
or vice versa if you want to activate the verbose debugging mode. In debugging mode, a lot of useful information is given, which can help you and the maintainer to get things right. In the following paragraphs, text written in teletype style shows what you see on the screen; comments on the output are set in the default font. The following section shows all important messages. Some of them are shown only while debugging mode is activated. The ordinary Linux kernel distribution always contains a driver release with debugging switched off, so it shows minimal output by default.
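
How such a compile-time switch gates output can be sketched as follows. This is a user-space illustration only, not the driver's code: the helper im_probe_msg is hypothetical, and printf stands in for the kernel's printk.

```c
#include <assert.h>
#include <stdio.h>

#define IM_DEBUG_PROBE      /* replace with "#undef IM_DEBUG_PROBE" for quiet output */

/* Hypothetical helper: prints a probe message only in debugging mode.
   Returns 1 if the message was printed, 0 if it was compiled out. */
static int im_probe_msg(const char *text)
{
    assert(text != NULL);
#ifdef IM_DEBUG_PROBE
    printf("IBM MCA SCSI: %s\n", text);
    return 1;
#else
    return 0;
#endif
}
```

Because the decision is made by the preprocessor, the driver has to be recompiled after the line is changed; the switch cannot be flipped at boot time.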

IBM SCSI-Subsystem Driver Messages

IBM MCA SCSI: Version 3.2pre2
This is the first message you get from the driver. It shows the driver version running in your kernel; 3.2pre2 is displayed as an example. If you get the line
IBM MCA SCSI: No Microchannel-bus support present --> Aborting.
              Enable MCA-Bus support in the Kernel-Config, first!
instead, the driver will not work, because Microchannel-Bus support is not activated in the kernel. It must be activated, as the driver needs to access the MCA-POS-registers of the SCSI-adapter(s).

IBM MCA SCSI: IBM SCSI Adapter w/Cache found in slot 1, io=0x3558, scsi id=7,
              ROM Addr.=0xc0000, port-offset=0x18, subsystem=enabled.
is an example of a valid, detected IBM SCSI-adapter or controller. When slot cards are found, their ROM address and port-offset are shown. The port-offset is essential: the driver adds it to the base 0x3540, and all communication with the SCSI-adapter goes through the resulting port. In addition, the SCSI id of the adapter is shown. The shared interrupt 14 is used by default; it is not shown during boot. Future development may make it necessary to allow a choice between interrupts 14 and 11, as interrupt 14 is the default or fixed for all IBM SCSI-adapters except the SCSI-2 Fast/Wide Adapter/A, which may use interrupt 11. If the driver cannot assign the correct I/O region to the adapter, there may be a conflict in the hardware setup. In that case, check the settings on your PS/2 using the reference disk. The message you get could look like this:
IBM MCA SCSI: Unable to get I/O region 0x3540-0x3547 (8 ports).
If Linux refuses to register the SCSI-adapter presented to it by the driver, you will get the message
IBM MCA SCSI: Unable to register host.
which appears if there is a bug in the SCSI-host structure that Linux does not like. In such cases, your hardware should be fine; the driver code or Linux is causing the trouble.
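
The port arithmetic described above can be sketched in a few lines. The base 0x3540 and the region size of 8 ports are taken from the messages quoted in this section; the macro and function names are hypothetical and are not the driver's own identifiers.

```c
#include <assert.h>
#include <stdint.h>

#define IM_IO_BASE   0x3540u  /* base port named in the text */
#define IM_IO_PORTS  8u       /* region size, from the "(8 ports)" message */

/* Hypothetical helper: first I/O port of an adapter with the given port-offset. */
static uint16_t im_io_port(uint8_t port_offset)
{
    return (uint16_t)(IM_IO_BASE + port_offset);
}

/* Hypothetical helper: last port of the region that starts at `first`. */
static uint16_t im_io_last(uint16_t first)
{
    return (uint16_t)(first + IM_IO_PORTS - 1u);
}

/* Check against the boot messages quoted above. */
void im_io_example(void)
{
    assert(im_io_port(0x18) == 0x3558);             /* port-offset=0x18 -> io=0x3558 */
    assert(im_io_last(im_io_port(0x00)) == 0x3547); /* region 0x3540-0x3547 */
}
```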

If the probing finds no IBM SCSI-adapter at all, you either do not have an IBM SCSI-subsystem installed or, if you are sure that you do, the driver still has recognition problems. The former is possible, as some PS/2s were delivered with Future Domain or Adaptec 1640 controllers. In such a case, you would get the following message after a few seconds:

IBM MCA SCSI: No IBM SCSI-subsystem adapter attached.

If an integrated SCSI-controller is found, the port is fixed at 0x3540 and the ROM address is 0xc0000. The chip revision and some vendor-bit information are shown instead.

IBM MCA SCSI: Control Register contents: 3, status: 4
The control-register contents can have various values during startup. This is one way to check adapter health during boot.

IBM MCA SCSI: This adapters' POS-registers: ff 8e 7 fc 22 ff ff ff
The POS-registers are 8 bytes that contain information about the hardware inserted into a certain slot on the Microchannel-Bus. This information is vital for probing the position and type of hardware on the bus. The first two POS-registers contain the adapter-id, a two-byte value that identifies the product type. Every type of MCA hardware must have a different adapter-id. The IBM SCSI-adapter w/Cache, for example, has 0x8eff.
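
As a small worked example, the adapter-id can be derived from the POS dump shown above. The sketch assumes that POS register 0 holds the low byte and POS register 1 the high byte, which is what the example values (ff 8e giving 0x8eff) imply; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Combine POS register 0 (low byte) and POS register 1 (high byte)
   into the two-byte adapter-id. */
static uint16_t mca_adapter_id(const uint8_t pos[8])
{
    return (uint16_t)(pos[0] | ((uint16_t)pos[1] << 8));
}

/* The POS dump from the message above: ff 8e 7 fc 22 ff ff ff */
void mca_adapter_id_example(void)
{
    const uint8_t pos[8] = { 0xff, 0x8e, 0x07, 0xfc, 0x22, 0xff, 0xff, 0xff };
    assert(mca_adapter_id(pos) == 0x8eff);
}
```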

If you get the line

IBM MCA SCSI: Subsystem SCSI-commands get bypassed
it means that the driver bypasses the adapter's built-in SCSI-commands. There are two ways of sending commands to SCSI-devices through the IBM adapters. The first is to use the adapter-integrated commands, which work at the maximum possible performance, but which IBM does not recommend for sequential SCSI-devices. The second is to bypass them: if you have problems with the driver or your adapter, you can set the command-line parameter bypass to force the driver not to use any adapter intelligence, so built-in commands are not used even where they would make sense. If this message does not appear, the driver is in its default mode and your SCSI-subsystem operates at maximum performance, provided it is in good health.

IBM MCA SCSI: Sync.-Transfer-Rate: 5.00 MHz, Timeout: 45 s
If you run IBM SCSI-subsystems at maximum speed, this line shows a synchronous data-transfer rate of 5 MHz, except for the SCSI-2 Fast/Wide Adapter/A, which can cope with 10 MHz. The timeout is set to 45 seconds at boot, which is the factory default. This timeout applies to all SCSI-devices and the adapter itself. With the command-line parameters fast, medium or slow, you can make the adapter run slower than maximum speed. The maximum synchronous data-transfer rate is the default. The adapter negotiates synchronous data transfer on a per-command basis.
IBM MCA SCSI: Current SCSI-host index: 0
The driver can handle up to 8 different IBM SCSI-adapters on the Microchannel-Bus in a single PS/2 system. If you have multiple adapters, they get different host indices, from 0 to 7.
IBM MCA SCSI: Removing default logical SCSI-device mapping................
Linux wants to know at which physical unit number (PUN) and at which logical unit number (LUN) of a physical unit SCSI access is possible, so that it can send commands there and keep the boot configuration simple. IBM SCSI-subsystems operate with their own internal device recognition, which hides the PUN/LUN map of the SCSI-bus completely from Linux. When power is switched on (during POST), the SCSI-adapters probe on their own for devices on the SCSI-bus and assign so-called logical device numbers (ldns), up to 15, to the valid PUN/LUN combinations found. 16 ldns are available, and the highest ldn is always used for the adapter itself. The solution to this problem is to remove all pre-assigned ldns again and to do the PUN/LUN probing by hand: the driver controls the process and creates a map of the PUN/LUNs on the SCSI-bus, which it presents to Linux and which corresponds to the physical reality on the SCSI-bus. This mapping is done in three steps. The line above is shown while the ldn settings are deleted. Each ldn prints a (.) on the screen when debugging is activated.
IBM MCA SCSI: Probing SCSI-devices......
This line is shown while the driver checks all possible PUN/LUN combinations on the SCSI-bus. If a PUN/LUN combination is valid, it is added to the physical SCSI-bus map in the driver's memory. In debugging mode, a (.) is printed for every checked PUN/LUN combination.
IBM MCA SCSI: Mapping SCSI-devices......
This line is shown when the 15(16) available ldns are distributed among the valid PUN/LUN combinations. If there are more than 16 valid PUN/LUN combinations, dynamic reassignment of the ldns during runtime is necessary to have full access to all SCSI-devices. In the first stage, the first 8 ldns are given to PUN=0..7 at LUN=0. If no device has valid LUNs greater than 0, the remaining ldns are assigned to PUN=0..7 at LUN=1 and so on. If there are valid LUNs greater than 0, the remaining ldns are assigned to them first, before being spread over invalid PUN/LUN combinations. During runtime, no ldn is left free, to prevent any intervention by the SCSI-adapter's intelligence.
IBM MCA SCSI: Device order: New Industry Standard (pun=0 is first).
The remapping of SCSI-devices inside the driver makes it possible to reverse the drive order presented to Linux. It makes no difference whether a device is known as PUN=6 instead of PUN=1, as long as this is kept consistent. Here, the driver can adapt to the user's philosophy. In the ANSI SCSI specs, the devices with the highest PUN value get the highest priority on the SCSI-bus; these should be the harddisks. Today, some newer industrial adapters and products assume that the lowest PUN value should have the highest priority. To end the discussion about which method is better for this driver, the user can choose between the two, either when compiling the kernel or on the command line. With the command-line parameter normal you get the new industry standard, as shown by the line above, where PUN=0 is the first device on the bus to be checked by the operating system. Other OSs would call this drive C:. If you specify ansi instead, the original specs are followed and PUN=7 is the first device to be scanned. In that case you will get the line:
IBM MCA SCSI: Device order: IBM/ANSI (pun=7 is first).
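
The two device orders amount to walking the eight PUNs in opposite directions. The following is a minimal sketch of that idea, not the driver's implementation; the function scan_pun and its ansi_order flag are hypothetical names.

```c
#include <assert.h>

#define MAX_PUN 8

/* Hypothetical helper: PUN that is checked at position `index` of the scan.
   ansi_order == 0: "normal" order, PUN=0 is first.
   ansi_order != 0: "ansi" order, PUN=7 is first. */
static int scan_pun(int index, int ansi_order)
{
    assert(index >= 0 && index < MAX_PUN);
    return ansi_order ? (MAX_PUN - 1 - index) : index;
}
```

With this mapping, scan_pun(0, 0) yields PUN 0 and scan_pun(0, 1) yields PUN 7, matching the two "Device order" lines.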

After all possible PUN/LUN combinations have been checked and the driver has been set up according to the hardware and the user's wishes, the following maps are shown if debugging mode is enabled:

IBM MCA SCSI: Determined SCSI-device-mapping:
    Physical SCSI-Device Map               Logical SCSI-Device Map
ID\LUN  0  1  2  3  4  5  6  7       ID\LUN  0  1  2  3  4  5  6  7
 0      T  +  +  +  +  +  +  +        0      0  7  e  -  -  -  -  -
 1      -  -  -  -  -  -  -  -        1      1  8  -  -  -  -  -  -
 2      R  +  +  +  +  +  +  +        2      2  9  -  -  -  -  -  -
 3      -  -  -  -  -  -  -  -        3      3  a  -  -  -  -  -  -
 4      D  +  +  +  +  +  +  +        4      4  b  -  -  -  -  -  -
 5      D  +  +  +  +  +  +  +        5      5  c  -  -  -  -  -  -
 6      D  +  +  +  +  +  +  +        6      6  d  -  -  -  -  -  -
 7      A  A  A  A  A  A  A  A        7      f  f  f  f  f  f  f  f
The contents of these tables vary with the devices installed on your SCSI-bus. The maps shown are an example with a tapedrive at PUN=0, a CD-ROM at PUN=2 and three harddisks at PUN=4..6. The left table contains the physical map of the SCSI-devices connected to the IBM SCSI-subsystem. The device type is checked for later command handling. The letters report the detected device types and have the following meaning:

Letter  Description
A       PUN/LUN combinations occupied/reserved by the IBM SCSI-subsystem
D       Valid harddisk (random access)
T       Valid tapedrive (sequential access)
P       Valid SCSI-processor device
W       Valid WORM-CD-writer/reader
R       Valid ordinary CD-ROM drive
S       Valid SCSI-Scanner
M       Valid Magneto-Optical (MO) drive
C       Valid Medium Changer / Jukebox / Syquest-drive
+       PUN/LUN combination occupied by a SCSI-device, but not provided
-       PUN/LUN combination without any connected device

The right table shows how the ldns were distributed among the valid PUN/LUN combinations. Because of dynamic reassignment, the logical map can change during runtime when more than 15(16) valid PUN/LUN combinations are connected to the SCSI-bus. Therefore, /proc/scsi/ibmmca/N always contains the current maps.
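
The initial distribution in the logical map above can be reproduced with a short sketch. It covers only the simple case of the example, where no device has a valid LUN greater than 0, and it assumes the adapter sits at PUN=7; the preference for valid higher LUNs and the dynamic reassignment are left out. All names, including the LDN_NONE marker for the "-" entries, are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PUN      8
#define MAX_LUN      8
#define MAX_LDN      15     /* ldns 0..14 go to devices */
#define LDN_ADAPTER  15     /* the highest ldn belongs to the adapter itself */
#define LDN_NONE     0xff   /* hypothetical marker for "no ldn assigned" */

/* Fill map[pun][lun] LUN by LUN: first all PUNs at LUN=0, then LUN=1, ...
   until the 15 device ldns are used up. */
static void assign_ldns(uint8_t map[MAX_PUN][MAX_LUN], int adapter_pun)
{
    int next = 0;

    assert(adapter_pun >= 0 && adapter_pun < MAX_PUN);
    for (int lun = 0; lun < MAX_LUN; lun++) {
        for (int pun = 0; pun < MAX_PUN; pun++) {
            if (pun == adapter_pun)
                map[pun][lun] = LDN_ADAPTER;      /* row of "f" in the table */
            else if (next < MAX_LDN)
                map[pun][lun] = (uint8_t)next++;  /* 0..6, then 7..d, then e */
            else
                map[pun][lun] = LDN_NONE;         /* "-" in the table */
        }
    }
}
```

Calling assign_ldns(map, 7) gives ldn 0 to 6 for PUN=0..6 at LUN=0, ldn 7 to d at LUN=1, and ldn e for PUN=0 at LUN=2, as in the right-hand table.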


Kernel Panics

The most common panic is the message

IBM MCA SCSI: Fatal errormessage from the subsystem (0x...,0x...)!
This message always appears when something goes wrong with the SCSI-driver or the IBM SCSI-subsystem. Up to now, most reported problems originated in the driver and were caused by the SCSI-subsystem getting bad input data from it. These bugs should be more or less history now, so this panic returns to its original function: indicating that something bad happened while the SCSI-subsystem processed a command. The SCSI-subsystem hardware can recognize three possible failures:
  • Hardware Failure
      The IBM SCSI-subsystem has detected an internal error in the hardware. This kind of error is, in principle, the death scream of your SCSI-adapter. When it appears, all you can check is whether the adapter card is properly seated in the MCA slot and whether the CMOS battery of the mainboard is still OK. If both seem OK, it could still be an error in the driver, but this is very improbable, as such bugs always get reported as Command Error.
  • Software Sequencing Error
      The software (driver) sent chained subsystem control blocks (SCBs) that had an invalid format or invalid block lengths or boundaries. Such cases indicate software bugs, so the SCSI-subsystem should be fine. For the moment, this SCSI-driver does not use chained SCBs, so this message should appear rarely, if ever.
  • Command Error
      This is quite common and always appears when the driver sends junk to the SCSI-subsystem. This can be an improper return buffer length or simply a SCSI-command that the IBM SCSI-subsystem does not support.

IBM MCA SCSI: ldn=0x..., SCSI-device on (...,...) vanished!
When the SCSI-subsystem boots, it stores information about every valid PUN/LUN combination on the SCSI-bus in a map. If you have more than 16 valid PUN/LUN combinations on your machine, dynamic reassignment of ldns becomes active. If, during such a reassignment, a previously found PUN/LUN turns out to be invalid, e.g. because the device was switched off, this message is presented.

IBM MCA SCSI: cmd already in progress for this ldn.
If a command is queued to a certain SCSI-device, the IBM SCSI-adapters do not allow any further command to be queued for that device. If the upper Linux layers nevertheless try to put more than one command in the queue, this message is presented. That may be caused by a bug in the ibmmca.h file or in the midlevel SCSI-driver.

IBM MCA SCSI: scatter-gather list too long.

IBM SCSI-adapters can read a command-input block not only as one contiguous memory block, but also as a block of multiple memory segments. Up to 16 of these scattered memory segments may exist. If more segments are required, this exceeds the capability of the IBM SCSI-adapter hardware and you get this message. As the maximum scatter-list length is stored in the ibmmca.h file, this should never appear.

Content created and/or collected by:
Louis F. Ohland, Peter H. Wendt, David L. Beem, William R. Walsh, Tatsuo Sunagawa, Tomáš Slavotínek, Jim Shorney, Tim N. Clarke, Kevin Bowling, and many others.

Ardent Tool of Capitalism is maintained by Tomáš Slavotínek.
Last update: 08 Sep 2024