
Common Adaptive Server Enterprise 12.0 and 12.5 Issues on Sun Solaris

Some issues have been identified with Adaptive Server Enterprise running on the Sun Solaris platform. This TechNote describes the problems, their symptoms and causes, workarounds where available, and solutions.
 

Contents

  Background
  Issue 1: Solaris Floating Point Register Corruption
  Issue 2: ASE Stack Guardword Corruption
  Issue 3: Failed SIGALRM Signal Delivery
  Issue 4: Shared Memory Disappearing
  Summary

Background

Sybase Technical Support has identified some common problems when using Adaptive Server Enterprise 12.0 and 12.5 on Solaris.  These problems may be related to Sybase or Sun bugs, or they may be due to architectural factors. Some of these problems can severely impact Adaptive Server, causing server outages, hangs, and data corruption.

Below is a summary of the known issues, information about steps you can take to avoid these problems, and details on available and forthcoming Sun patches and ASE EBFs.

Issue #1: Solaris Floating Point Register Corruption

Solaris can fail to restore floating-point registers correctly. Symptoms of the problem include "Infected with 10 and/or 11" errors; there have also been reports of database corruption, especially 614 and 644 errors. The problem has been seen in the following situations:
  • On Sybase ASE versions 12.0 and 12.5 on Solaris
  • Under high disk I/O rates on high-throughput storage configurations
  • On servers running large numbers of CPUs
Two separate Solaris issues have been identified which can cause these symptoms. The first issue is being tracked under:

        Sun Alert ID:  26588
        Sun Bug:        4439142

and the second issue under:

        Sun Alert ID:  45152
        Sun Bug:        4686943

For the first problem (Bug #4439142), the following patches are currently available. However, we recommend that you refer to the Sun Alert for the latest patch requirements.
 

Solaris Version   Patch Number
2.5.1             103540-35 & 111442-01
                  or 103540-36 & 111444-01
2.6               105181-26 & 111435-01
                  or 105181-28 & 111446-01
                  or 105181-29 or a later patch
2.7               106541-16 & 111437-01
                  or 106541-17 or a later patch
2.8               108528-07 & 111459-01
                  or 108528-08 & 111433-02
                  or 108528-09 or a later patch

For the second problem (Bug #4686943) you can find more information in Sun Alert 45152 and the following patches are available from Sun:
 

Solaris Version   Patch Number
2.5.1             none available
2.6               105181-33
2.7               106541-23
2.8               108528-16
2.9               112233-02

Please note that it is very difficult and time-consuming to prove the existence of either of these floating-point issues. We therefore recommend that you install the appropriate Sun patches to avoid any of the symptoms in ASE.
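
To see whether a given patch is already installed, the output of the Solaris showrev -p command can be scanned for the patch ID. The following is a minimal sketch, not a supported tool. The function name has_patch is hypothetical, and the sketch assumes that each installed patch appears on a line beginning with "Patch:" followed by an ID of the form base-revision. It also assumes an awk that accepts -v (on older Solaris releases that may mean nawk or /usr/xpg4/bin/awk).

```shell
#!/bin/sh
# has_patch BASE MINREV  (hypothetical helper)
# Reads `showrev -p` style output on stdin. Succeeds (exit 0) if a line
# "Patch: BASE-NN ..." is present with NN >= MINREV, otherwise exits 1.
# Assumption: the patch ID is the second whitespace-separated field.
has_patch() {
    awk -v base="$1" -v minrev="$2" '
        $1 == "Patch:" {
            split($2, id, "-")
            if (id[1] == base && id[2] + 0 >= minrev + 0) found = 1
        }
        END { exit found ? 0 : 1 }
    '
}
```

For example, on Solaris 2.8 you might run showrev -p | has_patch 108528 16 and check the exit status. A fix can also be delivered under a different patch number that obsoletes the one listed, which this sketch does not detect, so treat a negative result as a prompt to check the Sun Alert rather than as proof that the fix is missing.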

Issue #2: ASE Stack Guardword Corruption (ASE 12.0 only)

This problem results from Adaptive Server processes not obtaining sufficient memory (stack size). The symptoms of the problem are that the server aborts with a stack trace and the errorlog reports "Stack Guardword Corrupted".

This problem has been seen under the following conditions:

  • ASE 12.0 running on Solaris 2.6 or 2.8
  • Periods of high disk I/O rates, such as during "update statistics"
The issue is being tracked under Sybase CR# 220929.

You can work around this problem by increasing the Adaptive Server stack size configuration to four times the default, from 45 KB to 180 KB.
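
As an illustration of the arithmetic, the sketch below prints an sp_configure statement for a stack size that is four times a starting value. It assumes that the "stack size" parameter is expressed in bytes and that the 45 KB default corresponds to 46080 bytes; confirm the current value on your own server (for example by running sp_configure "stack size" with no new value) before changing it. The function name stack_size_sql is hypothetical.

```shell
#!/bin/sh
# stack_size_sql DEFAULT_BYTES FACTOR  (hypothetical helper)
# Prints an isql batch that sets "stack size" to DEFAULT_BYTES * FACTOR.
# Assumption: the parameter is specified in bytes.
stack_size_sql() {
    default_bytes=$1
    factor=$2
    new_bytes=$((default_bytes * factor))
    printf 'sp_configure "stack size", %d\ngo\n' "$new_bytes"
}

# 45 KB (46080 bytes) times 4 gives 180 KB (184320 bytes).
stack_size_sql 46080 4
```

The printed batch can be reviewed and then run through isql by a login with the appropriate role.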

The following Solaris patches are available (to ensure that you have the latest patch version, consult the Sunsolve web site):
 

Solaris Version   Patch Number
2.6               latest Sun kernel jumbo patch
2.8               108528-08, jumbo kernel patch
                  108652-32, X11 patch
                  108875-09, c2 audit patch
                  108879-07, Solstice AdminSuite
                  108968-05, rmmount patch
                  108974-11, drivers patch
                  108975-04, format patch
                  108977-01, libsmedia patch
                  108985-02, rshd patch
                  109137-01, pkginstall patch
                  109299-02, Veritas File Systems multiple fixes patch
                  109320-02, LP patch
                  109783-01, nfsd patch
                  109951-01, jserver buffer overflow
                  103346-28, flashprom update
                  108827-10, libthread patch
                  108901-03, rpcmod
                  108827-20, /usr/lib/libthread.so.1 patch
                  111177-02, /usr/lib/lwp/libthread.so.1 patch

Note:
The Sun patch numbers in this document are those available at the time of publication, and may not be the latest applicable patches. To be certain of obtaining the latest patches, consult the Sunsolve web site.

Issue #3: Failed SIGALRM Signal Delivery

ASE's use of the Solaris T1 threading model to implement native thread libraries can result in two different alarm signal (SIGALRM) delivery problems in specific situations.  The problem is seen in Adaptive Server 12.0 and 12.5 (but not in 12.5.0.1 and higher) across all versions of Solaris 2.5 through 2.8 currently supported for ASE, and is most often seen when using file system devices.

Reported symptoms include:

  1. sp_sysmon dies with a "divide by zero" error in the engines section. This results from clock ticks being 0, since that field is only updated when an alarm signal is delivered.
  2. Adaptive Server hangs. It may clear without intervention, or may require intervention.
  3. Intermittent performance problems; the same query, with the same general load on the system, can take much longer to complete.
  4. Unresolved deadlocks.
  5. Inability to halt an engine or cleanly shut down ASE. One or more engines must be killed from the unix prompt.
  6. Rep agent "hangs".
  7. CIS connections hang (various issues).
  8. Dumps and loads may hang.
  9. Alarms, such as waitfor, hang forever.
These issues are being tracked under:

Sun Bug:        4435240
Sybase CR#:  255865

and

Sun Bug:        4498831
Sybase CR#:  267970

Note:
The fix for CR 255865 requires a change to ASE's threading model, and is included in ASE 12.0.0.3 ESD #4 and higher releases. To activate this fix in those versions of ASE, you will need to set run-time trace flags 1631 and 1633. Refer to the relevant ESD coverletters for details and complete instructions. Note also that Item #7 above is tracked as CR# 242849 or CR# 237024.

You can use the Solaris truss utility to help identify this problem. For each dataserver process, run:

truss -c -p <dataserver process id>

from the unix command line for a period of 30 seconds to one minute; end it by using CTRL-C. If the section of output under "signals" shows approximately 10 SIGALRMs for each second truss was run, then this engine is not seeing the failed SIGALRM delivery problem. If any engine shows significantly fewer than 10 SIGALRMs per second, then you are probably experiencing the signal problem.
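
The per-second rate can be worked out by hand, or with a small helper such as the sketch below. It assumes the truss summary was saved to a file (for example with the truss -o option) and that the "signals" section contains a line of the form "SIGALRM <count>". The function name sigalrm_rate is hypothetical, and the elapsed time must be supplied by whoever ran truss.

```shell
#!/bin/sh
# sigalrm_rate SECONDS  (hypothetical helper)
# Reads saved `truss -c` summary output on stdin and prints the average
# number of SIGALRM signals per second over SECONDS of tracing.
# Prints 0.0 if no SIGALRM line is found.
sigalrm_rate() {
    awk -v secs="$1" '
        $1 == "SIGALRM" { count = $2 }
        END {
            if (secs + 0 <= 0) { print "seconds must be positive"; exit 2 }
            printf "%.1f\n", count / secs
        }
    '
}
```

For example, sigalrm_rate 30 < truss.out would report the rate for a 30-second trace saved in truss.out. A result close to 10 matches the healthy case described above; a much lower figure for any engine points to the signal delivery problem.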

The same symptoms are seen for both Solaris problems. It is not possible, without full system core dumps analyzed by Sun, to know which problem you may have encountered.

Recommended Solution Matrix

Solaris 2.6
  ASE 12.0 - 12.0.0.3 ESD#3:                   No fix - upgrade either ASE or Solaris
  ASE 12.0.0.3 ESD#4 - 12.0.0.4:               Use ASE traceflags 1631 and 1633
  ASE 12.0.0.5 - 12.0.0.7 (see note 2 below):  Use ASE traceflags 1631 and 1633
  ASE 12.0.0.8 and later:                      Use ASE traceflags 1631 and 1632
  ASE 12.5:                                    Not Supported
  ASE 12.5.0.1 and later:                      Not Supported

Solaris 2.7
  ASE 12.0 - 12.0.0.3 ESD#3:                   No fix - upgrade either ASE or Solaris
  ASE 12.0.0.3 ESD#4 - 12.0.0.4:               Use ASE traceflags 1631 and 1633
  ASE 12.0.0.5 - 12.0.0.7 (see note 2 below):  Use ASE traceflags 1631 and 1633
  ASE 12.0.0.8 and later:                      Use ASE traceflags 1631 and 1632
  ASE 12.5:                                    Not Supported
  ASE 12.5.0.1 and later:                      Not Supported

Solaris 8 before patch 108528-17 (see note 1 below)
  ASE 12.0 - 12.0.0.3 ESD#3:                   Use Solaris alternate thread library
  ASE 12.0.0.3 ESD#4 - 12.0.0.4:               Use Solaris alternate thread library OR ASE traceflags 1631 and 1633
  ASE 12.0.0.5 - 12.0.0.7 (see note 2 below):  Use Solaris alternate thread library OR ASE traceflags 1631 and 1633
  ASE 12.0.0.8 and later:                      Use Solaris alternate thread library OR ASE traceflags 1631 and 1632
  ASE 12.5:                                    No workaround available
  ASE 12.5.0.1 and later:                      No known issues

Solaris 8 with 108528-17 or later
  ASE 12.0 - 12.0.0.3 ESD#3:                   Use Solaris alternate thread library
  ASE 12.0.0.3 ESD#4 - 12.0.0.4:               Use Solaris alternate thread library AND ASE traceflags 1631 and 1633
  ASE 12.0.0.5 - 12.0.0.7 (see note 2 below):  Use Solaris alternate thread library AND ASE traceflags 1631 and 1633
  ASE 12.0.0.8 and later:                      Use Solaris alternate thread library AND ASE traceflags 1631 and 1632
  ASE 12.5:                                    No known issues
  ASE 12.5.0.1 and later:                      No known issues

Notes:

1) The second signal issue exists only on Solaris 8. It is caused by a problem in how Solaris implements Posix-compliant signal handling: under heavy workloads the operating system sets a flag saying that a signal is pending, but this flag is never reset, thus no new signals are issued to ASE. In 12.5.0.1 this was fixed using setitimers rather than posix timers, but since setitimers have known bugs in Solaris 2.6 and 2.7, and since ASE 12.0 is certified on those platforms, ASE 12.0 cannot use the setitimer approach. For further technical details on the bug (#4498831) see the description at http://sunsolve.sun.com (you may need a sunsolve login to view this information).

2) Trace flag 1632 was introduced in ASE 12.0.0.5 to help avoid some CIS performance issues reported when using trace flag 1633. However, a bug in the implementation of that fix can lead to loss of signals, particularly on very busy servers or servers with CPU contention (see CR 336858 for details). The fix for this problem has been implemented in 12.0.0.8. For releases between 12.0.0.5 and 12.0.0.7 inclusive, Sybase recommends using trace flags 1631 and 1633; if there is a CIS performance problem, upgrade to 12.0.0.8 and use trace flags 1631 and 1632.

Workarounds include:

   1. Use ONLY RAW devices for ALL Adaptive Server devices including tempdb. You may also be
       able to use Veritas Quick I/O. These options work because Solaris uses kernel threads to
       handle async I/O to raw devices, instead of the user threads (lightweight processes, or LWPs)
       used to handle async I/O to file system devices.

   This workaround applies regardless of the Solaris version used or the libthread version being
   implemented.

   2. For Solaris 2.8 *only*, add the following three lines to the start of the RUN_server-name file:

   LD_LIBRARY_PATH=/usr/lib/lwp:$LD_LIBRARY_PATH
   LD_LIBRARY_PATH_64=/usr/lib/lwp/64
   export LD_LIBRARY_PATH LD_LIBRARY_PATH_64

   Adding these lines ensures that you use an alternate version of the dynamic libthread library.

   This option works whether you are using a 32-bit or 64-bit ASE server.

Note:
Several sites have reported problems including engine core dumps and servers hanging on Remote I/O after implementing workaround #2. This appears to be due to Sun bug 4457358. Solaris patch 109384-02 must be installed to correct these problems if you choose this method.

Issue #4: Shared Memory Disappearing

There have been some cases recently where Adaptive Server's shared memory ID disappeared. Symptoms are as follows:
  • The problem occurs on Solaris 2.6, 2.7, and 2.8.
  • The server continues to work without any problems until shutdown.
  • On Solaris 2.7 and 2.8, but not 2.6, the server may core dump on shutdown.
  • While the server is running, attempts to attach to shared memory - for example, by a process such as sybmon - fail.
  • Executing the unix command ipcs -ma does not show the ASE dataserver's shared memory. In many cases, no shared memory ID is seen using this command.
This bug has been resolved under Sybase CR# 256710. The fix appears in version 12.0.0.4 ESD #2 and higher, as well as version 12.5.0.1 IR.

The problem appears to be caused by shell scripts starting Adaptive Server during system start-up.  If ASE gets a shared memory ID of 0, then Backup Server may remove that ID when it is run; this is CR# 256710 "Backup Server always (tries to) destroy shared memory segment 0."

To avoid this problem until a patched release is installed:

   1.  Do not start ASE from a shell script at system boot,

   OR

   2.  Run a program which allocates a shared memory segment before starting ASE.

Note:
CR #256710 applies only for configurations involving a local backup server. CR #333320 applies in the case of a remote backup server.
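
Because the failure depends on ASE being given shared memory ID 0, one way to sanity-check the second workaround before starting ASE is to confirm that ID 0 is already held by another segment. The sketch below scans ipcs output for such a segment. It assumes that each shared memory entry is printed as a line whose first field is "m" and whose second field is the numeric ID; the function name shm_id_zero_in_use is hypothetical.

```shell
#!/bin/sh
# shm_id_zero_in_use  (hypothetical helper)
# Reads `ipcs -m` style output on stdin. Succeeds (exit 0) if a shared
# memory segment with ID 0 is listed, otherwise exits 1.
# Assumption: segment lines look like "m <id> <key> <mode> <owner> ...".
shm_id_zero_in_use() {
    awk '$1 == "m" && $2 == "0" { found = 1 } END { exit found ? 0 : 1 }'
}
```

For example, a start-up script could run ipcs -m | shm_id_zero_in_use and start ASE only when the check succeeds. This is a sketch under the stated format assumption and is no substitute for installing a release that contains the fix.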

Summary

Sun has issued several significant fixes for Solaris in the last few months that can be helpful in resolving many of these issues. Customers encountering the above problems with ASE on Solaris are encouraged to install the latest Solaris patches available for their OS level. Visit the Sunsolve site for current information on the patches. Also visit the Sybase Technical Support site for updates on the Sybase CRs mentioned here.

 

DOCUMENT ATTRIBUTES
Last Revised: Jan 20, 2005
Product: Adaptive Server Enterprise
Technical Topics: Bug Information
  
Business or Technical: Technical
Content Id: 1016173
Infotype: Technote
 
 
 
