Storr Consulting - ADABAS Phonetic Descriptors



Phonetic Descriptor

Default and User-defined

By Dieter W. Storr

Last update: 9 November 2006

Default ADABAS Phonetization

Phonetic descriptors are a standard feature of ADABAS. The following example uses the compression utility ADACMP to define the phonetic descriptor PA, which is derived from the field AA:

FNDEF='01,AA,20,A,DE,NU'
PHONDE='PA(AA)'

Phonetic descriptors must be built from alphanumeric fields. A logical read on a phonetic descriptor is not possible, and the unique option (UQ) is not permitted. It makes no sense to use a PHONDE as a primary key. A PHONDE also cannot be part of a PE group or of a sub-, super-, or hyperdescriptor.

As far as I know, ADABAS still uses the "Kölner Phonetisierung," a phonetic procedure created by IBM Germany. For a description, see H. J. Postel, IBM-Nachrichten 198 (1969), 925-931.

Phonetic descriptors are built from the first 20 bytes of an alphanumeric field. In essence, a routine removes the vowels from the field value (in German, the letter Y counts as a vowel), and a phonetic algorithm translates the remaining letters. For example, the last names MAIER, MEIER, MEYER, and MEYR are all encoded as the value "1704", and this value is stored in the ADABAS inverted list. The following is a much simplified illustration:

Meyer = Mr = stored as 1704
Mayer = Mr = stored as 1704
Maier = Mr = stored as 1704
Meir  = Mr = stored as 1704
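
The vowel-stripping step in this simplified description can be sketched in Python. This is an illustration only: the helper name strip_vowels is hypothetical, and the sketch does not reproduce the encoding step by which ADABAS arrives at 1704.

```python
# Illustrative sketch only: shows the vowel-stripping step from the
# simplified description, not the actual ADABAS phonetic algorithm.
VOWELS = set("AEIOUY")  # Y is treated as a vowel, as in German


def strip_vowels(name: str) -> str:
    """Return the upper-cased name with all vowels (including Y) removed."""
    return "".join(ch for ch in name.upper() if ch not in VOWELS)


for name in ("Meyer", "Mayer", "Maier", "Meir"):
    # Every one of these names is reduced to the same consonants, "MR".
    print(name, "->", strip_vowels(name))
```

All four spellings reduce to the same consonant skeleton, which is why they can share one entry in the inverted list.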

If a language uses few vowels, finding similar-sounding names can be a problem; one example is the Bohemian/Czechoslovakian name Hrdlicka. German names are therefore better suited to ADABAS phonetic descriptors than English ones.

For more sophisticated phonetic rules, you can use special Soundex software or write a hyperexit that builds a hyperdescriptor.

The civil service and some companies in Germany, as well as the Los Angeles Times, use phonetic descriptors successfully.

User-defined Phonetization -- User Exit 3

This user exit may be used to perform user-defined phonetization. It is given control by the ADACMP utility or the Adabas nucleus whenever phonetic processing is required.

The user exit must develop a three-byte phonetic key using the value supplied. The address of the resulting phonetic key must be placed at 8(R1) before control is returned.

Parm   A fullword address of . . . 
-----  ------------------------------------------------------------
0(R1)  the four-byte length for the value to be phoneticized.  
4(R1)  the address of the value to be phoneticized. 
8(R1)  a three-byte location to contain the phonetic key. 
       This address is set to zero before the user exit and must be 
       set to the actual address during the user exit.

The call to the user exit is made using a standard BALR 14,15 assembler instruction. All registers must be saved when control is received and restored immediately prior to returning control to Adabas. The content of R15 is ignored.

ADAUX3   CSECT
ADAUX3   AMODE 31
ADAUX3   RMODE 24
         USING ADAUX3,12
         STM   14,12,12(13)
         LR    12,15
         L     6,0(1)                   GET A(LENGTH)
         L     6,0(6)                   GET ACTUAL LENGTH
         L     7,4(1)                   GET ADDRESS OF WORD
         L     7,0(7)
LOADCODE LA    8,CODE                   GET ADDRESS OF RESULT

(snip)

Click to get the entire assembler code of USER EXIT 3

Phonetic Descriptor Generated by NATURAL Programs

Paul Macgowan has sent the following program:

1. Remove all vowels A, E, I, O, U and the letter Y:
   MACGOWAN becomes MCGWN
2. Change all letters C to K:
   MCGWN becomes MKGWN
3. Change all letters G to J:
   MKGWN becomes MKJWN
4. Change all letters M to N:
   MKJWN becomes NKJWN
5. Change all letters Z to S:
   NKJWN becomes NKJWN (no change, as there are no Z's)
6. Change all letters V to F:
   NKJWN becomes NKJWN (no change, as there are no V's)
7. Remove any duplicate letters from the soundex key (e.g. MNKKJ would be changed to MNKJ):
   NKJWN becomes NKJWN (since there are no duplicates)
8. Add the first letter of the original surname to the beginning, even if it is a vowel or one of the other special letters C, G, M or Z:
   NKJWN becomes MNKJWN

Using the first given name HAMISH, apply rules 1 through 7 (not 8) to the first given name:
HAMISH becomes HNSH

Put the surname soundex (MNKJWN) into the first seven characters of the soundex key and the first given name soundex (HNSH) into the last three characters, truncating the two values to seven and three characters respectively.

So HAMISH MACGOWAN has a soundex key of MNKJWN HNS

Notice that because MNKJWN is only six characters long, a space is added so that it occupies the first seven characters. Also notice that HNSH has been truncated to three characters.
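
The rules above can be sketched in Python as follows. This is not Paul Macgowan's Natural program; the function names reduce_name, surname_key and soundex_key are hypothetical, and the sketch assumes that "duplicate letters" means adjacent repeats (as in MNKKJ becoming MNKJ) and that the given-name part is truncated but not padded.

```python
# Sketch of the rules described above; all names here are hypothetical
# and this is not the original Natural code.
VOWELS = set("AEIOUY")
# Rules 2 to 6. No replacement letter is itself replaced by a later rule,
# so applying them in a single pass gives the same result as one by one.
SUBSTITUTIONS = str.maketrans({"C": "K", "G": "J", "M": "N", "Z": "S", "V": "F"})


def reduce_name(name: str) -> str:
    """Apply rules 1 through 7 to a name."""
    # Rule 1: drop vowels and Y (non-letters are dropped too, an assumption).
    letters = "".join(
        ch for ch in name.upper() if ch.isalpha() and ch not in VOWELS
    )
    letters = letters.translate(SUBSTITUTIONS)
    # Rule 7, read here as collapsing adjacent repeats (MNKKJ -> MNKJ).
    result = []
    for ch in letters:
        if not result or result[-1] != ch:
            result.append(ch)
    return "".join(result)


def surname_key(surname: str) -> str:
    """Rules 1 through 8: the reduced surname prefixed with its first letter."""
    return surname[:1].upper() + reduce_name(surname)


def soundex_key(given: str, surname: str) -> str:
    """Seven characters for the surname (space padded), three for the given name."""
    return surname_key(surname)[:7].ljust(7) + reduce_name(given)[:3]


print(repr(soundex_key("HAMISH", "MACGOWAN")))  # prints 'MNKJWN HNS'
```

Because rule 8 puts the original first letter back in front, two surnames that reduce to the same letters but start with different letters still get different keys.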

Click to get Natural Soundex code #1

Paul Mercier has sent the following Soundex code routine.

Once the 4-character code has been generated, he stores it in the file where it is needed and uses it as part of a superdescriptor.

Click to get Natural Soundex code #2. Updated on Nov 20, 2006

General Info About Soundex

The original Soundex algorithm was patented by Margaret K. Odell and Robert C. Russell in 1918. The method is based on six phonetic classifications of human speech sounds (bilabial, labiodental, dental, alveolar, velar, and glottal), which in turn are based on where you put your lips and tongue to make the sounds.
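
For comparison with the 4-character codes mentioned above, here is a minimal Python sketch of the commonly documented American Soundex variant. It is an assumption that this variant is the relevant one; it is not necessarily what either contributed Natural routine does, and the function name soundex is chosen only for illustration.

```python
# Sketch of the commonly documented American Soundex variant
# (one letter plus three digits). Not taken from the Natural routines.
CODES = {}
for group, digit in (
    ("BFPV", "1"),
    ("CGJKQSXZ", "2"),
    ("DT", "3"),
    ("L", "4"),
    ("MN", "5"),
    ("R", "6"),
):
    for letter in group:
        CODES[letter] = digit


def soundex(name: str) -> str:
    """Return the 4-character Soundex code, or "" if the name has no A-Z letters."""
    letters = [ch for ch in name.upper() if "A" <= ch <= "Z"]
    if not letters:
        return ""
    first = letters[0]
    digits = []
    previous = CODES.get(first, "")  # the first letter's code also suppresses repeats
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != previous:
            digits.append(code)
        previous = code  # a vowel resets this, so a later repeat is coded again
    # Pad with zeros and keep one letter plus three digits.
    return (first + "".join(digits) + "000")[:4]


print(soundex("Robert"), soundex("Rupert"))  # prints: R163 R163
```

Each of the six digit groups collects consonants that are articulated in a similar place, which is the idea behind the phonetic classifications mentioned above.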