![]() |
|
Supplement to the Sybase IQ User-Defined Functions GuideAbstract: This paper provides a basic overview of the implementation of User-Defined Functions (UDFs) for Sybase IQ, available starting with release 15.1. It is meant to be a supplement to the Sybase IQ 15.1 User-Defined Functions Guide, which is provided as part of the product documentation set for Sybase IQ 15.1 and higher releases. Audience: This paper is intended for developers of UDFs, as well as administrators responsible for installing and configuring UDFs on a Sybase IQ instance. Author: Jay Mueller, Sybase Partner Certification Group, Application Engineering. Additional information provided by Sybase IQ Engineering. ContentsREVISIONS and REFERENCED DOCUMENTS TYPES OF USER-DEFINED FUNCTIONS PLANNING AND DESIGN IMPLEMENTATION - Basic Workflow SELECTED SQL CALL-BACK FUNCTIONS DECLARING A SQL FUNCTION TO ACCESS THE UDF INSTALLATION, UPGRADE AND UNINSTALLATION RUNTIME CONFIGURATION AND TESTING Skip forward to OVERVIEW Revisions:
Referenced Documents:Note: Occasionally, maintenance at Sybase.com will cause documents to be moved to different URLs. If the direct links in this Best Practices Guide fail to work, navigate to the referenced document and search for the relevant section within the document. OverviewThe purpose of this User-Defined Function (UDF) Development Best Practices Guide is to supplement the Sybase IQ User-Defined Functions Guide, by giving the developers of UDFs some general guidelines on how to get started, some design considerations, etc. There are several references from this Best Practices Guide back to the User-Defined Functions Guide, where additional detailed information is available. With release 15.1 of Sybase IQ, and release 3 of RAP - The Trading Edition™, an optional feature was added which enables partners to create User Defined Functions (hereafter referred to as “UDFs”) that execute within the database container. The UDF execution feature is available as an optionally-licenseable component for use within Sybase IQ or within the RAPStore component of RAP - The Trading Edition™ R3. For the remainder of this report, wherever Sybase IQ 15.1 is mentioned, the statement also applies to the RAPStore component of RAP - The Trading Edition™ R3. In order to execute external UDFs, the Sybase IQ database server must be started with the IQ_UDF license (see UDF Licensing below for details). Also available with RAP - The Trading Edition™ R3: Enterprise is a set of optional time-series functions which were implemented with this same UDF architecture. These functions are named with a 'ts_' prefix. These finance-specific functions are available only with RAP - The Trading Edition™ R3: Enterprise, and are documented in the Sybase IQ 15.1 Reference: Building Blocks, Tables, and Procedures, in the section titled Time series and forecasting functions. In order to execute the TimeSeries functions, the Sybase IQ database server must be started with the IQ_TS_FSF license option in addition to the IQ_UDF license option (these are both included in the license for RAP The Trading Edition R3 Enterprise - see UDF Licensing below for details). The purpose of developing and invoking UDFs that execute within Sybase IQ is to take advantage of the extreme performance of the IQ server, while also providing users the flexibility of analyzing their data with the flexibility of a programmatic solution. User-Defined Functions comprise two components.
A UDF is declared in the SQL environment via the creation of a SQL function or stored procedure which describes the parameters and provides a reference to the external library. The actual executable portion of the UDF is contained within an external (shared object or dynamic load) library file, which will be automatically loaded by the IQ server upon the first invocation of a UDF Declaration function or stored procedure associated with that library. Once loaded, the library will then remain resident in the IQ server for rapid access via subsequent invocations of SQL functions or stored procedures that reference that library. This architecture is represented in a diagram below. The mechanism for developing UDFs is via C or C++ coding. The C/C++ source code for the UDFs is compiled into one or more external libraries that are subsequently loaded into the IQ server's process space when needed. The UDF calling mechanism is defined to the Sybase IQ server via a SQL function. When the SQL function is invoked from a SQL query, the IQ server will load the corresponding library if it has not already been loaded. For simplicity of managing the UDF installation, it is recommended that UDF developers package many UDF functions within a single library To facilitate the construction of UDFs, a C-based API has been developed and provided by Sybase IQ. The API comprises a set of pre-defined entry points for the UDFs, a well-defined context data structure, and a series of SQL call-back functions that provide a communication mechanism from the UDF back to the Sybase IQ server. The Sybase IQ UDF API allows software vendors and expert end-users to develop their own UDFs. Software vendors may package and sell UDFs independently from Sybase IQ releases. A User-Defined Functions Guide has been delivered along with release 15.1 and later releases of Sybase IQ . The guide describes the UDF API, gives several examples of calling patterns for the functions, and details the operation of the data structure and call-back functions. Types of UDFsThere are several types of user-defined functions, depending on the following attributes: Cardinality of Input - Number of Input Rows per function call (Scalar vs. Aggregate)The number of input rows against which the function will operate will determine the classification of the UDF and how the underlying function code will be implemented. Scalar User-Defined FunctionsFunctions that operate against a single row of input are refered to as Scalar User-Defined Functions, also known simply as User-Defined Functions or UDFs. Unless otherwise indicated, a User-Defined Function is assumed to be scalar. Since scalar UDFs operate on a single row of input, all values are passed into the underlying function code at invocation time. Hence, no contextual information is stored between invocations of the underlying code associated with a scalar UDF. Example scalar (native) function invocation: select UCASE(DepartmentName) from Departments Example scalar UDF (sample UDF delivered with Sybase IQ): select MY_PLUS_COUNTER(DepartmentID) from Departments User-Defined Aggregate Functions (UDAFs)Functions that operate against a group of rows of input are referred to as Aggregate User-Defined Functions, also known as User-Defined Aggregate Functions, UDAFs or UDAs. Aggregate UDFs should be specifically referred to as such, since the default assumption for UDFs is that they are scalar. For example, the Sybase IQ User-Defined Functions Guide refers to these functions as UDAFs. The context structure for coding UDAFs is slightly different than the context structure used for coding scalar UDFs. Since a UDAF needs to collect data from multiple rows, the intermediate results will need to be stored in a context area. For more details about managing the context storage, see the section Context Storage for Aggregate UDFs. Example aggregate (native) function invocation: select MAX(DepartmentID) from Departments Example UDAF function (sample UDAF delivered with Sybase IQ): select MY_SUM(DepartmentID) from Departments Cardinality of Output - Number of Output values per function call (single output vs. multiple outputs)The number of output values to be generated by each invocation of the function will determine how the underlying function code will be implemented, as well as having implications on how users will invoke the UDF. Single-output UDFsWith traditional SQL functions, only a single output was generated upon each invocation of a function. Invoking a traditional, single-output scalar function on each row in a set of rows would generate a single output value for each row. For example, the native scalar function 'UCASE' returns a single, upper-case result from each line of mixed-case input: select UCASE(DepartmentName) from Departments Invoking a traditional, single-output aggregate function on a group of rows will generate a single output value for the set of input rows. For example, the native aggregate function MAX returns the largest numerical value from a list: select MAX(DepartmentID) from Departments Multiple-output UDFsWith UDFs, it is possible to develop functions that generate multiple output values per set of input data. Invoking a scalar UDF on each row in a set of rows could generate multiple output values for each row. For example, a multiple-output scalar UDF could be written to simultaneously compute the minimum and maximum values from a set of columns in a single row. A scalar UDF that produces multiple outputs could be referred to as a "UDF-MO". Unless otherwise specified, a scalar UDF can be assumed to only produce a single output. Invoking a UDAF on a group of rows could generate multiple output values for the set of input rows. For example, a multiple-output UDAF could be written to simultaneously compute the count, sum and average of a column of numerical values. A UDAF that produces multiple outputs could be referred to as a "UDAF-MO". Unless otherwise specified, a UDAF can be assumed to only produce a single output. Summary of the different types of UDFsGiven the two classes of input row cardinality and two classes of output result cardinality, there are four basic types of UDFs. The following diagram illustrates these different types of UDFs.
Note that any type of UDF could operate either on a single column from the input row(s), or on multiple columns within the input row(s). See the Other Attributes of User-Defined Functions section below for more details. Implementation Choices for Multiple-Output UDFs (UDF-MOs and UDAF-MOs)When multiple outputs are desired from a UDF or UDAF, there are a couple different ways of implementing them. There are advantages and disadvantages of these implementations. Multiple analysis functions (each having a single output) The simplest solution (albeit not the most efficient) would be to develop multiple functions that each produce a single-output. Single-output functions fit nicely into traditional SQL environments, and are relatively straightforward to implement. For the scalar example above, this could translate into two single-output UDF invocations:
For large amounts of data that require extensive analysis, this approach is not the most efficient (consider the possibility that there are 1000 columns in the table, or a scenario where more complex analysis is being performed). Single analysis, multiple accessors (using an intermediate variable) An alternate solution would be to develop a single 'analysis' function which performs the analysis once, and packages the multiple results into a single object, such as a binary or varchar variable. This would then be coupled with one or more result 'accessor' functions, which would be used to parse the binary or varchar variable and return the desired results. For the scalar example above, this would translate into a UDF-MO invocation having a pattern similar to the following: -- Set up a variable to store intermediate result object In the above example, the analysis UDF-MO ('my_minmax_compute') would return a single object (@mm_result_string), a string representing two float values appended together (for example: '123456.7987654.3'). The accessor single-output UDF ('my_minmax_value') would take the result object as one parameter, a request for a value as another parameter, and then return the designated result. For the simple example above, the 'my_minmax_value' functionality could also be implemented using a native substring function surrounded by a native cast operation. For example, using native functions, the results could also be retrieved as follows (non-UDF solution for retrieving values from a results object): ... Either method should result in output similar to the following:
The resulting values may not match the result string exactly due to the choice of intermediate result variable type and/or internal rounding caused by the cast operation. For this example, a binary variable or longer varchar (padded with zeros) would probably be a better choice for the intermediate result values, where final results are float values. Note that as of release 15.1 GA of Sybase IQ, there was a limit of 256 bytes for return values of UDFs and UDAFs. With release 15.1 ESD1, this limit has been increased to 32K. In a future release of Sybase IQ, this limit will be expanded to 2MB. Eventually, it will be possible to use binary large objects (BLOBs) or character large objects (CLOBs) as return values from UDFs and UDAFs. This will allow for much larger result objects (limited only by the architecture of the hardware). Since there was a strict limitation on the size of the return values in 15.1 GA, a few UDAF-MOs were developed which wrote intermediate results to a temporary output file. These functions will be redesigned with the intermediate variable paradigm in the near future. Going forward, UDF-MOs and UDAF-MOs should NOT write to output files (see the section Practices to Avoid below). Other attributes of User-Defined FunctionsIn addition to the cardinality of input and output, there are other attributes that could be considered to distinguish the functions. These attributes are not as significant as the cardinality attributes described above, but interesting nonetheless. Number of Input Columns / Parameters per Row of InputUDFs and UDAFs (as well as UDF-MOs and UDAF-MOs) may operate on a single input column or multiple input columns. For example, my_sum is a UDAF that operates on a single column of input data. The sample UDF my_plus is an example of a scalar UDF that operates on multiple input columns within a single input row. An example of a UDAF that operates on multilple columns could be a correlation function that compares two columns of input data. Multiple-column input values are specified on the SQL command line as a comma-delimited list. The underlying code is then created to handle these multiple parameters. See the example code for the scalar UDF function 'my_sum' to see how this is implemented. The same convention applies to multiple-column inputs to UDAFs. Other than parameter-parsing, the number of input columns does not affect the inherent semantics of the underlying UDF code, so there is no need to make a distiction between single-column input and multiple-column input for the various types of UDFs. Deterministic vs. Non-deterministicAnother attribute is whether the function returns a deterministic value, or a non-deterministic value. The result of a function could either be determined solely by the input parameters and data (deterministic), or by some random behavior (non-deterministic). Parameters of non-deterministic UDFs typically need a random seed as one of the input parameters. An example of a non-deterministic function would be a normal bell curve distribution, with parameters specifying the shape of the bell curve. Note: Since non-deterministic functions require an additional random seed, if the random seed is set up to be the first parameter of the function, than the additional parameters that represent the shape of the distribution will not map as you may expect. In other words, if the parameters of a non-deterministic function are (RandomNumber, A, B, C, ...), then parameter 2 maps to A, parameter 3 maps to B, etc. Return values, Null handling, Aggregation allowed, SQL Security.UDFs can be defined to return a variety of data types in the result. Another attribute of a scalar UDF defines whether to ignore or respect null values. Additional attributes of UDAFs define what to return on empty input (null or a value) and whether and how the function is designed to operate within various OLAP aggregation windows. For an in-depth discusion about OLAP (online analytical processing), refer to Chapter 2: Using OLAP, in the Sybase IQ 15.1 System Administration Guide: Volume 2. SQL Security can also be defined for each type of UDF. With SQL Security, the UDF can be defined to run as the owner (DEFINER) or the user that is executing the query (INVOKER). For more information about these attributes, see the sections Declaring a Scalar UDF (CREATE FUNCTION statement) or Declaring a UDAF (CREATE AGGREGATE FUNCTION statement) in the Sybase IQ User-Defined Functions Guide. UDF LicensingIn order to execute UDFs residing in external libraries, the Sybase IQ server must be running with the "IQ_UDF" license option (the "IQ_TS_FSF" license option is also required for executing the TimeSeries UDFs). Determining if the license is presentTo determine if the appropriate license has been obtained, view the Sybase IQ Message Log (filename with extension '.iqmsg') co-located with the database (.db) file and configuration (.cfg) file. To verify the ability to execute UDFs, look for the string "Checked out license for {n} IQ_UDF" (where {n} is the number of CPU cores on the machine where Sybase IQ is running). This message is written to the file during database startup, so if it is present, it should be located near the top of the iqmsg file. To verify the ability to execute TimeSeries functions, look for the string "Checked out license for {n} IQ_TS_FSF" (where {n} is the number of CPU cores on the machine where Sybase IQ is running). This message is written to the file during database startup, so if it is present, it should be located near the top of the iqmsg file. The database may be located in a variety of places. For example, by default, the demo database is located at:
Alternately, examine the license file or files (.lic extension) in your environment. The license file(s) will be either co-located with your database (.db) file and configuration (.cfg) file (see above), or will be located in the Sybase Software Asset Management (SySAM) folder. Depending on the operating system flavor, the SySAM folder is located at:
In the license (.lic) file, look for the "IQ_UDF" option in the list of "COMPONENTS=". Obtaining the UDF licenseThe IQ_UDF license is available from the Sybase Product Download Center (SPDC) as the "Sybase IQ 15.1 Partner Solutions Option". Trouble-shooting - license is present, but not recognized during DB startupThere is a known issue with Sybase IQ regarding the way databases are created and how they are initially configured to look for license files. The database-creation script 'mkiqdemo' creates a License Management Protocol (.lmp) file along with the new database. This file will have the same name as the database instance (e.g. 'iqdemo.lmp'), and will be co-located with the database (.db) file and configuration (.cfg) file. When first generated, the first two lines of this file contain the strings "PE=" and "LT=". The string "PE=" tells the Sybase IQ server which Platform Edition license to look for, and is initially blank. The string "LT=" tells the Sybase IQ server which License Type to look for, and is also initially blank. These two fields must be edited in the .lmp file in order to tell the Sybase IQ server how to find the available licenses. See the next paragraph to determine how these two fields must be modified. When downloaded from the SPDC, the "Partner Solutions Option" license will contain a string "VENDOR_STRING=PE=xx;LT=yy", where 'xx' is the Platform Edition and 'yy' is the License Type for this license. These are the values that must be present in the .lmp file. Planning and DesignSybase IQ is a world-class, high-speed, robust database and analytics engine. Any components such as UDFs that are added to Sybase IQ should maintain these qualities of the Sybase IQ server. The following sections describe some details of these considerations. As with any modern programming project, there are some basic considerations to keep in mind while developing UDFs. In addition to the standard considerations related to software development in general, there are some additional considerations when working within a database container. For example, concurrency, availability and data integrity. See the section Considerations for Working within the Sybase IQ Container below for a more complete list and additional details. When executing within a database container, there are certain activities that should be avoided. See the section Practices to Avoid below for details. For additional considerations regarding UDF development, refer to the section User-defined function restrictions in the User-Defined Functions Guide. There are several flavors of UDFs that can be created. For example, scalar vs. aggregate, single-output vs. multiple-output, determinate vs. non-determinate. See the section Types of UDFs for a brief description of these different flavors of UDFs. After reading and understanding the considerations, practices to avoid and learning about the types of UDFs, the section Design - Basics describes the sample UDF source code provided along with Sybase IQ, and gives a high-level description of some of the components within the sample UDF source code. Considerations for Working within the Sybase IQ ContainerSome of the additional considerations for working within the Sybase IQ database container are listed below.
RobustnessThe reliability required by a database is extremely high. UDFs running within a database environment must maintain this high level of reliability. With the first implementation of the UDF API, UDFs run within the Sybase IQ server. If a UDF aborts prematurely or unexpectedly, this may cause the Sybase IQ server to also abort. It is important to ensure that UDFs do not terminate prematurely or abort unexpectedly under any circumstances. Data IntegrityIt is extremely important that a database maintains data integrity. In no circumstances should data be lost, modified, augmented, or corrupted. Since UDF execution happens within the Sybase IQ server, there is a risk of corrupting data. It is important to practice caution with memory management and any other use of pointers. A recommendation has been made for users to install and execute UDFs within a read-only multiplex node, but that policy cannot be guaranteed. For added protection each IQ server can enable or disable the execution of UDFs, via a startup option. See the section Enabling and disabling user-defined functions in the User-Defined Functions Guide. Flexible InstallationThe Sybase IQ product can be installed in a wide variety of configurations. UDFs must be easily installed within this environment, and must be able to run within all supported configurations. The Sybase IQ installer provides a default installation directory , but allows users to select a different installation directory if they so choose. In order to complement Sybase IQ, UDF developers should consider providing the same flexibility with the installation of the UDF libraries and associated SQL function definition scripts. Seemless ExecutionUDFs will be expected to run seemlessly within the database container. Although Sybase IQ is a complex product consisting of many files, the main user interaction is via an IQ server process (iqsrv15), using industry-standard SQL (Structured Query Language). Execution of UDFs should be accomplished entirely via SQL commands, and the user should not be required to understand the underlying implementation method in order to use the UDFs. UDFs run under the cover of Sybase IQ, so no console messages should be written. Any feedback to the user should be presented via pre-defined exception messages. UDFs should manage memory and temporary results as defined by the Sybase IQ UDF API. For additional details on the Sybase IQ UDF API, refer to the User-Defined Functions Guide. Sybase IQ manages disk I/O in a reliable manner to guarantee data availability and integrity. UDFs should generally not write to or read from the file system. Sybase IQ is a multi-user application. Many users could be executing the same UDF simultaneously. Certain OLAP queries will cause a UDF to be executed multiple times within the same query, sometimes in parallel. For additional details on setting UDFs to be able to run in parallel, refer to the section Aggregate UDF calling patterns in the User-Defined Functions Guide. Rapid UpgradeMany Sybase IQ installations are running in mission-critical environments. Customers require an extremely high level of availability from Sybase IQ. To provide the highest availability, it is important that system administrators be able to install and upgrade UDFs with little or no impact to the Sybase IQ server. Upgrades to Sybase IQ or underlying components (such as UDFs) should be accomplished with little or no downtime to the IQ server. ServiceabilitySybase IQ provides extensive logging and tracing capabilities. In the unlikely event that there is a problem with Sybase IQ, system administrators have access to this very detailed logging and tracing information. It is important that UDFs provide the same or better level of detailed logging, in the event that trouble-shooting indicates that a problem was encountered within the UDF code. User ControlSybase IQ can handle extremely large data sets, and some queries can potentially run for long periods of time. Occasionally a query will take an unusually long time to execute. The SQL client provides the user with the ability to cancel a query if it is taking too long to complete. Native functions provided by Sybase IQ are implemented in a manner that allows them to know when a user has canceled a query. UDFs must also be written in a manner that they understand if a query has been canceled by the user. In other words, UDFs should support the ability for users to cancel long-running queries that invoke UDFs. For details on this implementation, see the 'get_is_canceled' function description in the list of SQL callback functions below. InternationalizationSybase IQ has been internationalized, so that it can be sold in different countries around the world, to users who speak many different languages. Error messages have been extracted from the code and put into external files. This provides the capability to localize those error messages to new languages, without having to make extensive code changes. In order to support multiple languages, UDFs should also be internationalized. In general, most UDFs will operate on numeric data. In some cases, a UDF may accept string keywords as one or more of the parameters. These keywords should be placed in external files, in addition to any exception text and log messages used by the UDF. Sybase IQ has also been localized to a few non-English foreign languages. In order to support localization to the same languages as Sybase IQ supports, it is recommended that UDFs be internationalized to facilitate the localization process. Localization may be performed at a later date, by an independent organization. For details about international language support in Sybase IQ, refer to the Sybase IQ System Administration Guide, Volume 1, Chapter 11: International Languages and Character Sets There is also a white paper available at Sybase.com which describes how to debug multi-language client-server environments. This white paper is titled Debugging Using Cross-Character-Set Maps. This paper describes mainly the task of dealing with multi-byte data, as opposed to input keywords, exception messages and log entries. Most UDFs will deal with numeric data, as opposed to character data. Platform DifferencesBe aware that the functions will need to be flexible to run on a variety of platforms supported by Sybase IQ. The Sybase IQ 15.x server runs on 64-bit architectures. It is supported under several flavors of the MS Windows (64-bit) family of operating systems. It is also supported on various types and versions of Unix (64-bit), including Solaris, HP-UX, AIX and Linux. Customizable Installation LocationSee Flexible Installation above. Practices to AvoidThe following is a list of practices to avoid.
UDF Naming ConventionsUDF names must follow the same restrictions as other identifiers in Sybase IQ. Sybase IQ identifiers have a maximum length of 128 bytes. For simplicity of use, UDF names should start with an alphabetic character as defined by Sybase IQ. Alphabetic characters as defined by Sybase IQ are the letters of the alphabet, plus underscore (_), at sign (@), number or pound sign (#) and dollar sign ($). UDF names should consist entirely of these alphabetic characters as well as digits (the numbers 0 through 9). UDF names should not conflict with SQL reserved words. There is a list of SQL reserved words in the Sybase IQ 15.1 manual Reference: Building Blocks, Tables, and Procedures, Chapter 2: SQL Language Elements, section Reserved words. Although UDF names (as other identifiers) may also contain reserved words, spaces, characters other than those listed above, and may start with a non-alphabetic character, this is not recommended. If UDF names have any of these characteristics, then they must be enclosed in quotes or square brackets, which makes it more difficult to use them. A list of reserved words in Sybase IQ can be found on the page titled Reserved words under the section Keywords in Chapter 2: SQL Language Elements of the Sybase IQ manual Reference: Building Blocks, Tables, and Procedures. The direct link to the Reserved words section is http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.dc38151.1510/html/iqrefbb/Alhakeywords.htm The UDFs will reside in the same name space as other SQL functions and stored procedures. In order to avoid conflicts with existing stored procedures and functions, UDFs should be prefaced with a unique short (2-letter to 5-letter) acronym and underscore. UDF names should be chosen that will not conflict with other SQL functions or stored procedures already defined in the local environment. The following are some of the prefixes which are already in use by native Sybase IQ, the optional Time Series financial functions, and UDFs available from other vendors:
* For a list of DB Lytix functions, see the Partner Certification Report: Fuzzy Logix, LLC - DB Lytix version 1.0 User-Defined Functions (UDFs) running on Sybase IQ 15.1 with the IQ_UDF license option, Report 321, August 2009. For additional details on the DB Lytix functions, refer to DB Lytix product documentation or contact Fuzzy Logix, LLC (see Vendor Contact Information below). ** For a list of the financial time series analysis functions, see the Sybase IQ 15.1 manual Reference: Building Blocks, Tables, and Procedures, Chapter 4: SQL Functions, section Time series and forecasting functions. For additional details on these functions, refer to the section Alphabetical list of functions within the same document. The time series and forecasting capability is only available with RAP - The Trading Edition™ R3 Enterprise. Design – BasicsAs with any modern programming project, there are some basic considerations to keep in mind while developing UDFs. This Best Practices Guide assumes that the UDF developer is familiar with the basics of developing shrink-wrap software, including good program design and development, independent testing, etc. In addition to standard software development practices, the UDF developer should keep in mind that they are developing code to be executed within the Sybase IQ database container (see Considerations for Working within the Sybase IQ Database Container above). Also, it is important to understand the limitations imposed by the database container (see the section Practices to Avoid). Developers of aggregate UDFs should also be familiar with OLAP queries, and how they translate into UDF calling patterns (see the section Aggregate UDF calling patterns in the User-Defined Functions Guide for various OLAP examples, and their associated calling patterns). Because the UDFs may be invoked by several threads simultaneously, they must be constructed to be thread-safe. Sample codeStarting with Sybase IQ 15.1 and subsequent releases, sample UDF source code is delivered along with the product. As this code may evolve slightly from release to release, the newest version of the sample code will be that delivered with the most recent release. To see if there were last-minute changes to the sample UDF source code, refer to the Sybase IQ Release Bulletin for the relevant release and operating system platform. For Sybase IQ 15.1, the Release Bulletins for various supported platforms are located at: http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.help.iq.15.1/title.htm It should be noted that the sample UDF code documented in the User-Defined Functions Guide may not be the latest version as delivered with the Sybase IQ product. On Unix platforms, the sample UDF code is delivered at: $SYBASE/IQ-15_1/samples/udf On Windows platforms, the sample UDF code is delivered at: C:\Documents and Settings\All Users\SybaseIQ\samples\udf Note: The sample code for the 'my_interpolate' function has several variables named with the prefix 'preceeding_'. Although the code will compile and execute properly, these variables should be renamed to 'preceding_' to correct the spelling. SQL function definitions and associated SQL queries that employ window framing should use the correct spelling of the keyword 'PRECEDING'. Implementation - Basic WorkflowThis section describes the components of a UDF program, and how the program should be built. There is also another overview of this process in the section Sybase IQ overview in the User-Defined Functions Guide. The UDF executable code must be developed in a 64-bit environment. The Sybase IQ 15.x server runs in 64-bit environments only. The basic steps to developing a UDF are as follows:
The next section lists the various SQL call-back functions available for use within UDF code. Setting the dynamic library interfaceFor Sybase IQ UDFs, only version 3 interface style (EXTFN_V3_API) is supported. See the section Setting the dynamic library interface in the User-Defined Functions Guide for the C syntax to define this interface. The UDF code must include the header file 'extfnapiv3.h'. Defining the descriptor structureThe descriptor structure varies between scalar and aggregate UDFs. For scalar UDFs, define the structure 'a_v3_extfn_scalar' (see the section Scalar UDF descriptor structure in the User-Defined Functions Guide for syntax). For aggregate UDFs, define the structure 'a_v3_extfn_aggregate' (see the section Aggregate UDF descriptor structure in the User-Defined Functions Guide for syntax). Function entry pointsThe function entry points differ for scalar and aggregate UDFs. Information about calling patterns and how the Sybase IQ server invokes the scalar and aggregate UDF entry points is documented in the User-Defined Functions Guide, in the section UDF callback functions and calling patterns and its sub-sections.
Context Storage for Aggregate UDFsContext variables control whether the intermediate results of aggregate functions are to be managed by the UDF itself (forcing the IQ Server to run the UDFs serially), or whether the memory is to be managed by the IQ server. The context area is used to transfer or communicate data between multiple invocations of the UDF within the same query (particularly within OLAP-style queries). If the '_calculation_context_size' is set to 0, then the UDF will be required to manage all interim results in memory (forcing the IQ server to invoke the UDF sequentially over the data (vs. being able to invoke many instances of the UDF in parallel during an OLAP query). If the '_calculation_context_size' is set to a non-zero value, then the IQ server will manage a separate context area for each invocation of the UDF, allowing multiple instances of the UDF to be invoked in parallel. To make the most efficient use of memory, consider setting the '_calculation_context_alignment' to something smaller than the default (depending on the size of the context storage needed). For details on context storage, refer to the description of '_calculation_context_size' and '_calculation_context_alignment' in the section Aggregate UDF descriptor structure in the User-Defined Functions Guide. These variables are near the end of the descriptor structure. There is also a detailed discussion of the use of context storage in the section Calculation context in the User-Defined Functions Guide. IMPORTANT: Whenever it is required to use memory to store intermediate results within an aggregate UDF, remember to initialize the memory with the _start_extfn function, and to clean-up and de-allocate any memory with the _finish_extfn function. Calling patternsThe calling pattern for a scalar UDF is provided in the section Scalar UDF calling pattern in the User-Defined Functions Guide The various calling patterns for the entry points in an aggregate UDF are defined in the User-Defined Functions Guide, in the section Aggregate UDF calling patterns and subsequent sections. Selected SQL Callback FunctionsThis section describes some of the SQL callback functions available for use within the UDF code. Note that some of these call-back functions are specific to scalar UDF coding, or aggregate UDF coding: For additional details about callback functions, see the section UDF and UDAF callback functions in the User-Defined Functions Guide. Getting the UDF Input Parameters (scalar or aggregate)Use the 'get_value( )' call-back function to get input values for the UDF. Getting Subsequent Input Data (aggregate)Use the 'get_piece( )' call-back function (with an offset) for additional data, if the piece length is less than the total length. Returning the UDF Result back to the IQ Server (scalar)Use the 'set_value( )' call-back function to return the result back to the IQ server. Allowing Server to Terminate Functions (scalar or aggregate)Use the 'get_is_cancelled( )' call-back function within loops in the UDF code, to ensure that the user is able to cancel a query containing this UDF if the query is taking too long to complete. Generating an Exception (scalar or aggregate)Use the 'set_error( )' call-back function to set an exception that will be forwarded by the IQ server to the SQL client. Logging a Message (scalar or aggregate)Use the 'log_message( )' call-back function to log a message to the IQ Message Log (file named <database name>.iqmsg). Note that messages longer than 255 bytes may be truncated in the Message Log. Converting Data (scalar or aggregate)To convert data from one SQL Data Type to another, use the 'convert_data()' call-back function. Other OLAP-style Query Information Functions (aggregate)
Future Call-back FunctionsNote that additional call-back functions will be provided over time. Please refer to the User-Defined Functions Guide for the complete list of SQL call-back functions. Declaring a SQL Function to Access the UDFAfter the UDF code has been written and compiled, the next step is to create a SQL Function that will invoke the UDF from the appropriate library file, sending the input data to the UDF. There are many styles of SQL Functions. The syntax versions for declaring SQL functions for UDFs are provided in the User-Defined Functions Guide, in the sections titled Declaring a Scalar UDF and Declaring a UDAF. There are also descriptions in the User-Defined Functions Guide on how to create SQL functions using Sybase Central, in the sections titled Declaring a scalar user-defined function in Sybase Central and Declaring a user-defined aggregate function in Sybase Central. Examples of SQL function definitions are supplied in the User-Defined Functions Guide along with the various code samples. Note: The sample code for the 'my_interpolate' function has several variables named with the prefix 'preceeding_'. Although the code will compile and execute properly, these variables should be renamed to 'preceding_' to correct the spelling. SQL function definitions and associated SQL queries that employ window framing should use the correct spelling of the keyword 'PRECEDING'. See Problems with the 'PRECEEDING' keyword below. Installation, Upgrade and UninstallationSybase IQ should be shut down during UDF file maintenanceAs with any application which employs external libraries, it is very important that the application not be attempting to access an external library while the associated library file is being moved, overwritten or deleted. Since libraries are automatically loaded whenever an associated SQL function is invoked, it is important to follow these steps in the exact order whenever performing any type of maintenance on existing UDF libraries:
For additional details about external libraries, refer to the section Managing external libraries in the User-Defined Functions Guide. For syntax on revoking and granting execute permissions, see the section Granting and revoking permissions in the User-Defined Functions Guide. Runtime Configuration and TestingAfter UDF external code has been coded, compiled and linked, and the corresponding SQL functions and/or stored procedures have been defined, the UDFs are ready to be tested. UDFs should be run in a read-only node in a multiplex environmentIn order to ensure the safest environment possible, it is highly recommended that UDFs be installed and invoked from a read-only IQ server node in a multiplex installation. Disable UDF execution in critical IQ server nodesTo disable the ability to execute external UDFs in a particular IQ server, the administrator may specifiy the following flag in the startup command or configuration file:
See the section Enabling and disabling user-defined functions in the User-Defined Functions Guide. Debugging UDFs with a Safer Execution ModeTo catch some of the more serious API errors, set the Sybase IQ option 'external_udf_execution_mode' to 2 while testing. For increased performance, this setting should be reduced back to 1 or 0 at the completion of testing. For additional details regarding this execution mode, see the section Controlling error checking and call tracing in the User-Defined Functions Guide. Problems with the 'PRECEEDING' keywordNote: In the Sybase IQ 15.1 System Administration Guide, Volume 2, there is a typo in the paragraph title "UNBOUNDED PRECEEDING and FOLLOWING" on the page Window framing in Chapter 2: Using OLAP ->Analytical Functions. The proper spelling of the keyword is 'PRECEDING' (remove the double-E). SQL function definitions and associated SQL queries that employ window framing should use the correct spelling of the keyword 'PRECEDING'. How to enable full tracing in Sybase IQConsider turning on full tracing within Sybase IQ in a debug environment. To enable IQ tracing, add the following flags to the Sybase IQ server startup command line or the Sybase IQ config file: -zr all -zo filename where filename is the path and filename of the tracing output file. Location of Log FilesLogfiles for the database are generally co-located with the database file and configuration file. On Unix platforms, there are two files named after the database instance, one with a .stderr extension and one with a .stdout extension. On Windows, the stderr file is not generated by default by the start_iq utility. To capture the stderr messages along with the stdout messages under Windows, run the iqsrv15.exe directly, redirecting the stdout and stderr as follows: iqsrv15.exe @iqdemo.cfg iqdemo.db 2>&1 > iqdemo.stdout The Windows output messages are slightly different from what is generated on Unix platforms. The first execution of a UDF will take longer due to library loadThe Sybase IQ server will not load the library containing the UDF code until the first time the UDF is invoked. The first execution of a UDF residing in a library that has not yet been loaded may be unusually slow. After the library is loaded, the subsequent invocation of the same UDF or another UDF contained in the same library will have the expected performance. Note that libraries may be manually unloaded with the stored procedure SA_EXTERNAL_LIBRARY_UNLOAD and will not be re-loaded after the IQ server is stopped and restarted. In environments where after-hours maintenance operations require a shutdown and restart of the IQ server, it is advisable to run some test queries after the Sybase IQ server has been restarted. This will ensure that the appropriate libraries are loaded in memory for optimal query performance during business hours. Additional InformationSybase Product DocumentationProduct Manuals are available from Sybase.com, through the following navigation: From this page, select Sybase IQ from the Product menu, which will bring up the documentation sets for all releases Sybase IQ. Select Sybase IQ 15.1 from the list of Documentation Sets. This will bring up all of the documents for release 15.1 of Sybase IQ. Direct Link: http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.help.iq.15.1/title.htm Partner Certification ReportsPartner Certification Reports describe the interaction of third-party products with Sybase IQ, RAPStore, and other Sybase database types. To see the list of Partner Certification Reports available, use the navigation path: The direct link to the Partner Certification Reports is: Sybase Partner ProgramFor more information about the Sybase Partner Program, please visit the Sybase Partner Program website. Vendor Contact InformationVendor Name: Fuzzy Logix, LLC Address: 5900 Graburns Ford Dr., Charlotte, NC 28269 Technical White PapersTechnical White Papers discuss topics of a technical nature, and augment Sybase Product Manuals. To see the list of Technical White Papers available, use the navigation path: Direct Link: Sybase e-ShopSybase provides product and technical documentation online at MySybase.com, including the certification reports written by the Sybase Interoperability Group. Paper versions of product manuals may also be ordered through Sybase's online store, http://e-shop.sybase.com.
|
|||||||||||||||||||||||||||||||||||||||||||||