Content by Michael Lang (original archived HERE). Edited by Major Tom.
Last content update: 09 Dec 2000
This page gives background information to help you solve problems with your
IBM MCA SCSI-subsystem, the driver and Linux. It includes examples and
describes the command line parameters.
No Reason for Alarm
Not everything that looks like a problem or a severe error at first glance
is a real error. Several known issues may appear that have nothing to do
with the driver or MCA-Linux.
- When it boots, it hangs after the line IBM MCA SCSI: Scanning SCSI-devices. or, from version 3.2a onward, after the line ... cleared, .
If the driver appears to hang here for some time at boot time or while being
loaded as a module, this is completely normal. Depending on your adapter
type, the number of SCSI devices and whether multiple LUN probing is
selected in the kernel configuration, it can take up to 2 minutes
until the whole PUN/LUN space of a SCSI-bus has been scanned for valid devices.
If a SCSI-device is not found at the first attempt, the adapter has to wait
for a certain timeout period, which the ANSI specifications grant to all
SCSI-devices. If nothing has answered by the end of that timeout period,
no device is present, and that is what the driver marks in its internal
PUN/LUN map.
- It hangs forever during boot time.
This can happen if some of your SCSI-devices are faulty. You should
check for this by booting some other operating system, such as DOS or OS/2.
If that fails to boot as well, it is obviously not a software problem.
In all other cases, please refer to the next section.
Debugging or Non-Debugging Mode
When booting Linux, the driver can show up in two clearly different ways.
If debugging inside the driver code is activated,
you get quite verbose information about the different activities of the
driver during initialization. If you would like to get rid of the verbose
output, you need to change the line
#define IM_DEBUG_PROBE
into
#undef IM_DEBUG_PROBE
or make the reverse change if you want to activate the verbose debugging mode
while tracking down a problem.
In debugging mode, a lot of useful information is given, which can help
you and the maintainer to get things right.
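As a rough illustration of how such a compile-time switch behaves, the
following C sketch prints one progress dot per probe step only while
IM_DEBUG_PROBE is defined. The helper probe_dots_printed() is hypothetical
and is not taken from the driver source; only the macro name comes from
the text above.

```c
#include <assert.h>
#include <stdio.h>

/* Change this to "#undef IM_DEBUG_PROBE" to compile the output away. */
#define IM_DEBUG_PROBE

/* Hypothetical helper (not from the driver): walks over `combinations`
 * probe steps and returns how many progress dots were printed. */
static int probe_dots_printed(int combinations)
{
    int printed = 0;

    assert(combinations >= 0);
    for (int i = 0; i < combinations; i++) {
#ifdef IM_DEBUG_PROBE
        putchar('.');   /* verbose mode: one dot per checked step */
        printed++;
#endif
    }
    return printed;
}
```

With the #define in place, probe_dots_printed(3) prints three dots. After
changing it to #undef and recompiling, the same call prints nothing.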
In the following paragraphs, text written in teletype style shows
what you see on the screen. Comments on the output are set in the default font.
The following paragraphs show all important messages. Some
of them appear only while debugging mode is activated.
The ordinary Linux kernel distribution always contains a driver release
with debugging switched off, so it shows minimal output by
default.
IBM SCSI-Subsystem Driver Messages
IBM MCA SCSI: Version 3.2pre2
This is the first message you get from the driver. It shows the version of
the driver that is running in your kernel; here, 3.2pre2 is displayed as an
example. If you get the line
IBM MCA SCSI: No Microchannel-bus support present --> Aborting.
Enable MCA-Bus support in the Kernel-Config, first!
instead, the driver will not work, as no Microchannel-Bus support is activated
in the kernel. This must be activated, as the driver needs to access the
MCA-POS-registers of the SCSI-adapter(s).
IBM MCA SCSI: IBM SCSI Adapter w/Cache found in slot 1, io=0x3558, scsi id=7,
ROM Addr.=0xc0000, port-offset=0x18, subsystem=enabled.
is an example of a valid, detected IBM SCSI-adapter or controller.
When slot cards are found, their ROM address and port-offset are shown.
The port-offset is essential: the driver adds it to the base 0x3540.
All communication with the SCSI-adapter goes through this port. In addition,
the SCSI id of the adapter is shown. The shared interrupt 14 is
used by default; this is not shown at boot time. Future development
may make it necessary to allow selecting between interrupts 14 and
11, as interrupt 14 is the default or is fixed for all IBM SCSI-adapters except
the SCSI-2 Fast/Wide Adapter/A, which may use interrupt 11.
If the driver cannot assign the correct I/O region to the adapter, there
may be a conflict in the hardware setup. In that case,
check the settings on your PS/2 using the reference disk. The
message you get could look like this:
IBM MCA SCSI: Unable to get I/O region 0x3540-0x3547 (8 ports).
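The port arithmetic behind these messages can be sketched in C as follows.
This is only an illustration: the macro, struct and function names are
made up for this example, and the check that offsets are multiples of 8 is
an assumption derived from the 8-port region width shown in the message
above.

```c
#include <assert.h>

#define EXAMPLE_IO_BASE  0x3540u  /* base port named in the text */
#define EXAMPLE_IO_PORTS 8u       /* region width, as in "(8 ports)" */

/* First and last port of one adapter's I/O region (hypothetical type). */
struct io_region {
    unsigned int first;
    unsigned int last;
};

static struct io_region adapter_io_region(unsigned int port_offset)
{
    struct io_region r;

    /* Assumption: offsets are multiples of the 8-port region width. */
    assert(port_offset % EXAMPLE_IO_PORTS == 0);
    r.first = EXAMPLE_IO_BASE + port_offset;
    r.last  = r.first + EXAMPLE_IO_PORTS - 1;
    return r;
}
```

For the slot adapter in the example above, a port-offset of 0x18 gives the
region 0x3558-0x355f, which matches the reported io=0x3558. An offset of 0
gives 0x3540-0x3547, the region named in the error message.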
If Linux refuses to register the SCSI-adapter presented to it by the driver,
you will get the message
IBM MCA SCSI: Unable to register host.
which appears if there is a bug in the SCSI-host structure that Linux does
not like. In such cases, your hardware should be fine, and the driver code
or Linux is causing the trouble.
If the probing for IBM SCSI-adapters fails completely, you do not have any
IBM SCSI-subsystem installed (or, if you are sure that you do, the driver
still has recognition problems). The former is possible, as some PS/2s were
delivered with Future Domain or Adaptec 1640 controllers. In such a case,
after some seconds you would get the message:
IBM MCA SCSI: No IBM SCSI-subsystem adapter attached.
If an integrated SCSI-controller is found, the port is fixed at 0x3540 and
the ROM address is 0xc0000. The chip revision and some vendor-bit info
are shown instead.
IBM MCA SCSI: Control Register contents: 3, status: 4
The control-register contents could have various values during startup. This
is one way to check for adapter-health during boottime.
IBM MCA SCSI: This adapters' POS-registers: ff 8e 7 fc 22 ff ff ff
The POS-registers are 8 bytes that contain information about the hardware
inserted into a certain slot on the Microchannel-Bus. This information is
vital for probing the position and type of hardware on the bus.
The first two POS registers contain the type-unique adapter-id. This two-byte
value describes the product type. Every different type of MCA hardware must have
a different adapter-id. The IBM SCSI-adapter w/Cache, for example, has 0x8eff.
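As a small worked example, the adapter-id can be assembled from the first
two POS bytes as sketched below. The function name is hypothetical, and
the byte order (first register is the low byte) is inferred from the
example line above, where the bytes ff 8e correspond to the id 0x8eff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: combine POS register 0 (low byte) and
 * POS register 1 (high byte) into the 16-bit adapter-id. */
static uint16_t pos_adapter_id(const uint8_t pos[8])
{
    assert(pos != NULL);
    return (uint16_t)(pos[0] | (pos[1] << 8));
}
```

Fed with the eight bytes ff 8e 07 fc 22 ff ff ff from the message above,
this yields 0x8eff, the id given for the IBM SCSI-adapter w/Cache.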
If you get the line
IBM MCA SCSI: Subsystem SCSI-commands get bypassed
it means that the driver bypasses the adapter's built-in SCSI commands. There
are two ways of sending commands to the SCSI-devices through the IBM adapters.
The first one is to use the adapter-integrated commands, which work at maximum
possible performance, but which IBM does not recommend for sequential
SCSI-devices. If you have problems with the driver or your adapter has some
problems, you may set the command line parameter bypass to force the
driver not to use any adapter intelligence. Built-in commands are then not
used, even where they would make sense. If this message does not appear, the
default is in effect and your SCSI-subsystem operates at maximum performance,
provided it is in good health.
IBM MCA SCSI: Sync.-Transfer-Rate: 5.00 MHz, Timeout: 45 s
If you run IBM SCSI-subsystems at maximum speed, this line shows a
synchronous data transfer rate of 5 MHz, except for the SCSI-2 Fast/Wide
Adapter/A, which can cope with 10 MHz. The timeout value is set to 45 seconds
at boot time, which is the factory default. This timeout applies to all
SCSI-devices and the adapter itself. With the command line parameters fast,
medium or slow, you can make the adapter go slower than maximum speed. The
maximum synchronous data transfer rate is the default. The adapter negotiates
synchronous data transfer on a per-command basis.
IBM MCA SCSI: Current SCSI-host index: 0
The driver can handle up to 8 different IBM SCSI-adapters on the
Microchannel-Bus in one single PS/2 system. If you have multiple adapters,
they get different host indices, from 0 to 7.
IBM MCA SCSI: Removing default logical SCSI-device mapping................
Linux likes to know at which physical unit number (PUN) and at which
logical unit number (LUN) of a physical unit SCSI access is possible, so
that it can send commands there and keep the boot-up
configuration easy. IBM SCSI-subsystems operate with some internal device
recognition, which hides this PUN/LUN map of the SCSI-bus completely from Linux.
When power is switched on (during POST), the SCSI-adapters probe on their own
for devices on the SCSI-bus and assign so-called logical device numbers (ldn)
to all valid PUN/LUN combinations found, up to 15. 16 ldns are available, where
the highest ldn is always used for the adapter itself. A solution to this
problem is to remove all pre-assigned ldns again and to do the PUN/LUN
probing by hand, where the driver controls the process and creates a map of
the PUN/LUNs on the SCSI-bus, which it presents to Linux and which corresponds
to the physical reality on the SCSI-bus. This mapping is done in three steps.
The line above is shown while the ldn settings get deleted. Every ldn prints
a (.) on the screen when debugging is activated.
IBM MCA SCSI: Probing SCSI-devices......
While this line is shown, all possible PUN/LUN combinations along the SCSI-bus
are checked. If a PUN/LUN combination is valid, it is added to the physical
SCSI-bus map in the driver's memory. In debugging mode, a (.) is printed for
every checked PUN/LUN combination.
IBM MCA SCSI: Mapping SCSI-devices......
This line is shown when the 15(16) available ldns are distributed among the
valid PUN/LUN combinations. If there are more than 16 valid PUN/LUN combinations,
dynamic reassignment of the ldns during runtime is necessary to have full access
to all SCSI-devices. In the first stage, the first 8 ldns are given to PUN=0..7 and LUN=0.
The remaining ldns are assigned to PUN=0..7 for LUN=1 and so on, if no device has
valid LUNs greater than 0. If there are LUNs greater than 0, the remaining ldns
are assigned to them first, before being spread over invalid
PUN/LUN combinations. During runtime, no ldn is left free, to prevent any
intervention by the SCSI-adapter's intelligence.
IBM MCA SCSI: Device order: New Industry Standard (pun=0 is first).
The remapping of SCSI-devices inside the driver makes it possible to
invert the drive order presented to Linux. It makes no difference whether
some device is known as PUN=6 instead of PUN=1, as long as this is applied
consistently. This lets the driver adapt to the user's preference.
In the ANSI-SCSI-specs, the devices with the highest PUN-value get the
highest priority on the SCSI-bus. These should be the harddisks. Today, some
newer industry adapters and products assume that the lowest PUN-value should
have the highest priority. To end the discussion about which method is
better for this driver, there are two ways that can be chosen
by the user, either while compiling the kernel or on the command line. When you
give the command line parameter normal, you get the new industry
standard as shown by the above line, where PUN=0 is the first device on the
bus to be checked by the operating system. Other OSs would call this drive
C:. If you give ansi instead, the ANSI specs are followed
and PUN=7 is the first device to be scanned. In such a case you will get the
line:
IBM MCA SCSI: Device order: IBM/ANSI (pun=7 is first).
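The two device orders can be pictured with a short C sketch. The enum and
function names are hypothetical. The text only states which PUN comes
first in each mode, so the strictly ascending or descending order of the
remaining PUNs is an assumption made for this illustration.

```c
#include <assert.h>

/* Hypothetical names for the two orders selected by normal and ansi. */
enum device_order { ORDER_NORMAL, ORDER_ANSI };

/* Returns the PUN that is scanned as the n-th device (n = 0..7).
 * Assumption: normal counts upwards from 0, ansi downwards from 7. */
static int nth_scanned_pun(enum device_order order, int n)
{
    assert(n >= 0 && n <= 7);
    return (order == ORDER_ANSI) ? 7 - n : n;
}
```

Under these assumptions, the first scanned device is PUN=0 with normal and
PUN=7 with ansi, as the two messages above report.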
After all possible PUN/LUN combinations have been checked and the driver has
been set up according to the hardware and the user's wishes, the following
maps are shown if debugging mode is enabled:
IBM MCA SCSI: Determined SCSI-device-mapping:
    Physical SCSI-Device Map            Logical SCSI-Device Map
ID\LUN  0  1  2  3  4  5  6  7      ID\LUN  0  1  2  3  4  5  6  7
   0    T  +  +  +  +  +  +  +         0    0  7  e  -  -  -  -  -
   1    -  -  -  -  -  -  -  -         1    1  8  -  -  -  -  -  -
   2    R  +  +  +  +  +  +  +         2    2  9  -  -  -  -  -  -
   3    -  -  -  -  -  -  -  -         3    3  a  -  -  -  -  -  -
   4    D  +  +  +  +  +  +  +         4    4  b  -  -  -  -  -  -
   5    D  +  +  +  +  +  +  +         5    5  c  -  -  -  -  -  -
   6    D  +  +  +  +  +  +  +         6    6  d  -  -  -  -  -  -
   7    A  A  A  A  A  A  A  A         7    f  f  f  f  f  f  f  f
The contents of these tables vary with the devices installed on your
SCSI-bus. The maps shown are an example with a tape drive at
PUN=0, a CD-ROM at PUN=2 and three harddisks at PUN=4..6.
The left table contains the physical map of the SCSI-devices connected to the
IBM SCSI-subsystem. The device type is checked for later command handling.
The letters shown report the detected device types and have the following
meaning:
Letter | Description |
A | PUN/LUN combinations occupied/reserved by the IBM SCSI-subsystem |
D | Valid harddisk (random access) |
T | Valid tapedrive (sequential access) |
P | Valid SCSI-processor device |
W | Valid WORM-CD-writer/reader |
R | Valid ordinary CD-ROM drive |
S | Valid SCSI-Scanner |
M | Valid Magneto-Optical (MO) drive |
C | Valid Medium Changer / Jukebox / Syquest-drive |
+ | PUN/LUN combination occupied by a SCSI-device, but not provided |
- | PUN/LUN combination without any connected device |
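To make the letters more concrete, the sketch below maps the peripheral
device type codes of the SCSI-2 INQUIRY command to the letters from the
table. The pairing of codes and letters is an assumption based on the
SCSI-2 standard, not on the driver source. The function name and the '?'
fallback are hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: translate a SCSI-2 peripheral device type code
 * into the map letter used in the table above. */
static char device_type_letter(int scsi_type)
{
    assert(scsi_type >= 0);
    switch (scsi_type) {
    case 0x00: return 'D';  /* direct-access device (harddisk) */
    case 0x01: return 'T';  /* sequential-access device (tape) */
    case 0x03: return 'P';  /* processor device */
    case 0x04: return 'W';  /* write-once device (WORM) */
    case 0x05: return 'R';  /* CD-ROM device */
    case 0x06: return 'S';  /* scanner device */
    case 0x07: return 'M';  /* optical memory device (MO) */
    case 0x08: return 'C';  /* medium changer device */
    default:   return '?';  /* our marker for types not in the table */
    }
}
```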
The right table shows how the ldns were distributed among the valid PUN/LUN
combinations. The logical map can change during runtime through
dynamic reassignment when more than 15(16) valid PUN/LUN combinations are
connected to the SCSI-bus.
Therefore, /proc/scsi/ibmmca/N always contains the current maps.
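The logical map in the example above can be reproduced with a simple fill
order, sketched here in C. This is a simplified illustration and not the
driver's code: it ignores which devices are actually valid, so it only
covers the case where no device has valid LUNs greater than 0. All names,
and the fixed adapter id of 7, are assumptions taken from the example.

```c
#include <assert.h>

#define MAX_PUN     8
#define MAX_LUN     8
#define ADAPTER_PUN 7     /* SCSI id of the adapter in the example */
#define ADAPTER_LDN 15    /* the highest ldn belongs to the adapter */
#define LDN_NONE    (-1)  /* our marker for "-" in the logical map */

/* Hand out ldns 0..14 LUN by LUN over all non-adapter PUNs. */
static void fill_logical_map(int ldn[MAX_PUN][MAX_LUN])
{
    int next = 0;

    for (int lun = 0; lun < MAX_LUN; lun++) {
        for (int pun = 0; pun < MAX_PUN; pun++) {
            if (pun == ADAPTER_PUN)
                ldn[pun][lun] = ADAPTER_LDN;
            else if (next < ADAPTER_LDN)
                ldn[pun][lun] = next++;
            else
                ldn[pun][lun] = LDN_NONE;
        }
    }
    assert(next == ADAPTER_LDN);  /* all 15 device ldns were used */
}
```

After the call, PUN=0 holds the ldns 0, 7 and e at LUN=0..2, PUN=6 holds 6
and d, and every LUN of PUN=7 holds f, as in the right table.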
Kernel Panics
The most common panic is the message
IBM MCA SCSI: Fatal errormessage from the subsystem (0x...,0x...)!
This message always appears when something is going wrong with the SCSI-driver
or the IBM SCSI-subsystem. Up to now, most reported problems originated
in the driver and were caused by the SCSI-subsystem getting bad
input data from the driver. These bugs should be more or less
history now, so this panic has returned to its original function: indicating
that something bad happened while the SCSI-subsystem processed a command.
The SCSI-subsystem hardware can recognize three possible failures.
These are:
- Hardware Failure
The IBM SCSI-subsystem has detected an internal error in the hardware.
This kind of error is in principle the death scream of your SCSI-adapter.
When such an error appears, the only things you can check are whether the
adapter card is properly seated in the MCA slot and whether the CMOS battery of
the mainboard is still ok. If both seem to be ok, it could still be an
error in the driver, but this is very improbable, as such bugs always get
reported as Command Error.
- Software Sequencing Error
The software (driver) sent chained subsystem control blocks (SCBs) that
had an invalid format or were invalid in their blocklengths or boundaries.
Such cases indicate software bugs, so the SCSI-subsystem should be fine.
For the moment, this SCSI-driver does not use chained SCBs, so this message
should appear rarely if ever.
- Command Error
This is quite common and always appears if the driver sends junk to the
SCSI-subsystem. This can be an improper return buffer length or just a
SCSI-command that the IBM SCSI-subsystem does not support.
IBM MCA SCSI: ldn=0x..., SCSI-device on (...,...) vanished!
When the SCSI-subsystem is booting, it stores information about every
valid PUN/LUN combination on the SCSI-bus in a map. If you have more than
16 valid PUN/LUN combinations on your machine, dynamic reassignment of ldns
becomes active. If, during such a reassignment, a previously found PUN/LUN is
no longer valid, e.g. because the device was switched off, this message is
presented.
IBM MCA SCSI: cmd already in progress for this ldn.
If a command is queued to a certain SCSI-device, the IBM SCSI-adapters do not
allow any further command to be queued for this device. If the upper Linux
layers nevertheless try to put more than one command in the queue, this message
is presented. That may be caused by a bug in the ibmmca.h file or in
the midlevel SCSI-driver.
IBM MCA SCSI: scatter-gather list too long.
IBM SCSI-adapters can read one command-input block not only as one
single contiguous memory block, but also as a block of multiple memory segments.
There may be up to 16 of these scattered memory segments. If more
segments are required, that exceeds the capability of the IBM SCSI-adapter
hardware and you get this message. As the maximum scatter-list length is
stored in the ibmmca.h file, this should never appear.
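The limit can be illustrated with a minimal C sketch. The struct, the macro
name and the function are hypothetical and only mirror the rule stated
above, that a list may hold at most 16 segments. They do not reproduce the
definitions in ibmmca.h.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_MAX_SEGMENTS 16  /* limit stated in the text */

/* One scattered memory segment (hypothetical layout). */
struct sg_segment {
    void  *addr;
    size_t len;
};

/* Sums the segment lengths into *total and returns 0, or returns -1
 * if the list has more segments than the adapter can handle. */
static int sg_total_length(const struct sg_segment *segs, size_t count,
                           size_t *total)
{
    assert(segs != NULL && total != NULL);
    if (count > EXAMPLE_MAX_SEGMENTS)
        return -1;  /* the "scatter-gather list too long" case */
    *total = 0;
    for (size_t i = 0; i < count; i++)
        *total += segs[i].len;
    return 0;
}
```

A list of 2 segments of 512 and 1024 bytes is accepted and sums to 1536
bytes, while a list of 17 segments is rejected.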