Problem

ADASAV sporadically abends at the end of the back-up job with A78-18. It happens only on the Virtual Tape Server (VTS), and the JESx job log displays the following messages:

    (snip)
    IEC705I TAPE ON 0432,L02649,SL,COMP,ADAP4BKF,ADASAVS.SAVE,ADABAS.PROD.DB4.BACKUP
    IEA794I SVC DUMP HAS CAPTURED: 523
    DUMPID=040 REQUESTED BY JOB (ADAP4BKF)
    DUMP TITLE=COMPON=IOS,COMPID=SC1C3,ISSUER=IECVPST,PSTFRRTN
    IEA705I ERROR DURING FREEMAIN SYS CODE = A78-18 ADAP4BKF ADASAVS SAVE 00
    IEA705I 00F75880 009F9068 009F9068 00000300 00980000 0E653000
    IEA995I SYMPTOM DUMP OUTPUT 531
    SYSTEM COMPLETION CODE=800
    TIME=21.08.49 SEQ=28035 CPU=0000 ASID=0077
    PSW AT TIME OF ERROR 070C0000 80FF2380 ILC 0 INTC 00
    (snip)

DD statement for the back-up data set:

    XXDDSAVE1 DD DSN=ADABAS.&ENV.&VAULT..&DB.BACKUP.&TYPE(+1),
    XX DISP=(,CATLG),UNIT=(&UNIT,&DRIVES,DEFER),
    XX VOL=(,RETAIN,,&VOLS),SPACE=(TRK,(&TRK,&TRK),RLSE),
    XX DCB=SYS2.MODEL,BUFNO=38
    IEFC653I SUBSTITUTION JCL - DSN=ADABAS.PROD.DB4.BACKUP.FULL(+1),DISP=(,CATLG),
    UNIT=(CART,,DEFER),VOL=(,RETAIN,,255),SPACE=(TRK,(1665,1665),RLSE),
    DCB=SYS2.MODEL,BUFNO=38
    (snip)

Remark: The parameters "UNIT=CART" and "UNIT=3590-1" make it very easy to switch between VTS and 3590.

Possible Solutions

SL24, Technical Paper #520226 recommends:

Reducing BUFNO has proven to be a temporary workaround. For further analysis, please contact IBM, as some APARs are available concerning this issue:
IBM recommends applying fixes OW57695 and OW57553 as well as OA15735 (test-packaging status).

SAG Technical Support also recommends applying ZAP AO742024 for ADA742 or AO741020 for ADA741.
Regarding support request #616706, SAG wrote: "Due to IBM APARs OA1029 and OA15735 (not official yet), we will close this request. If further issues arise, IBM is to be the initial contact. Discussed the BUFNO parm and settings; this is something that we cannot recommend, it is only a work-around to decrease it, until the APARs are applied."

Some companies switched from VTS to 3590 cartridges to avoid the problem.

Remarks

APARs, PTFs, ZAPs

Applied:
IBM APARs: OW57695 and OW57533
IBM PTFs: UA06858, UA02072, UA06783
SAG ZAP AO742024 for ADA742

Not applied (waiting to be released):
IBM APAR OA10292 with PTF UA21361 and IBM APAR OA15735 are in 'TESTPACKAGING' status

BUFNO
SL24, Technical Paper #337, Setting BUFNO DCB Parameter
When ECKD devices and cylinder I/Os are in use, many more buffers are needed. This is because a cylinder read needs up to 30 QSAM buffers. Since up to 8 QSAM buffers may be already filled but not yet written out, the recommended minimum would be a BUFNO of 38 to avoid long QSAM waits. ADASAV writes out with a variable block size, depending on how much used data of complete ASSO- or DATA blocks in sequence will fit within 32K. For that reason, it may be that 8 QSAM blocks do not make up 240K. Therefore, QSAM may write out more than 8 buffers in one I/O, and the minimum DCB=BUFNO should be slightly higher for good performance; a BUFNO specification of 42 should be okay.
An even higher value may improve performance, but to a lesser degree.
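The buffer arithmetic in the paper can be sketched in a few lines of Python. This is only an illustrative model, not a formula from Software AG: it assumes that one QSAM write moves about 240K, so the number of buffers per write is 240K divided by the average block size, and it adds the 30 buffers a cylinder read may need. The function names and the 20K average block size are hypothetical; the paper itself states only the two figures 38 and 42.

```python
# Illustrative model of the BUFNO reasoning (an assumption, not SAG's formula).

CYLINDER_READ_BUFFERS = 30      # from the paper: a cylinder read needs up to 30 QSAM buffers
WRITE_IO_BYTES = 240 * 1024     # from the paper: 8 full buffers make up 240K


def buffers_per_write(avg_block_bytes: int) -> int:
    """Buffers needed to fill one 240K write, rounded up (integer ceiling)."""
    if avg_block_bytes <= 0:
        raise ValueError("average block size must be positive")
    return (WRITE_IO_BYTES + avg_block_bytes - 1) // avg_block_bytes


def min_bufno(avg_block_bytes: int) -> int:
    """Cylinder-read buffers plus the buffers that may be waiting for one write."""
    return CYLINDER_READ_BUFFERS + buffers_per_write(avg_block_bytes)


if __name__ == "__main__":
    # Full 30K blocks: 8 buffers per write, so 30 + 8 = 38.
    print(min_bufno(30 * 1024))
    # Hypothetical 20K average block: 12 buffers per write, so 30 + 12 = 42.
    print(min_bufno(20 * 1024))
```

Under this model, shorter variable-length blocks raise the buffer count per write, which is consistent with the paper's advice to go slightly above 38.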
Rainer Herrmann, Software AG Germany, wrote an article in 1998 about "Performance With ADASAV."
Conversation Between a Customer and IBM

Item BDC000029949
Source..........: CA PDDB0
Last updated....: 20060504
Abstract........: A878-18 Error During Freemain
USERS: ADABAS BACKUP UTILITY
PROBLEM SUMMARY: A878-18 abend
SOLUTION: Apply Maintenance
PROBLEM DETAILS:

Customer: Not sure where to go with this one. Running OS/390 V2R10. ISV product ADABAS backup utility receives the following recursive freemain abends:

    DUMP TITLE=COMPON=IOS,COMPID=SC1C3,ISSUER=IECVPST,PSTFRRTN
    IEA705I ERROR DURING FREEMAIN SYS CODE = A78-18 HPDAJ101 ADARUN ADASAV 00
    IEA705I 00FB9E80 009DFB60 009DFB60 00000300 00A00000 0ECAB000
    IEA995I SYMPTOM DUMP OUTPUT
    SYSTEM COMPLETION CODE=800

This backup utility has been working fine for a number of successive runs daily for over a few weeks. Did a bunch of searches, but can't find anything pertinent to this particular problem or recent. Any insight would be much appreciated. Have many SVC dumps to choose from.

IBM: Have you checked with the makers of ADABAS? The abend indicates that something is trying to free some private storage that is fixed. Since ADABAS is receiving this abend, there is a good chance they are doing the freemain. If you like, terse one of the ABENDA78 dumps and send it to me. I can have a quick look. Send it to CS56387 at TORIBM.

Customer: I appreciate your looking at the dump for confirmation. I have NJE'd dump IS10298.ETR79344.SVCDUMP.APR804 to TORIBM.CS56387. I have told the ADABAS support guy that it is their problem in all likelihood and to pursue it with the ADABAS vendor (Software AG).

IBM: I downloaded and untersed ISC.PMR79344.B035.DUMP01. It looks like we (IBM) need to investigate further into this problem. The dump is of an ABEND800. However, the SYSTRACE reveals that the first abend is an ABENDA78-18. This abend indicates that a FREEMAIN was attempted against private storage but part of it is fixed. The FREEMAIN that caused this error is:

    SVC 78 078C0000 00E116BA 00000003 00A00000 0ECAB000

It was trying to free x'0ECAB000' for x'A00000' bytes. The SVC 78 was issued out of the LMOD IGG0201Z+x'26BA'. I did an AMBLIST on my 2.10 sandbox to find the correct mod. The SVC 78 is issued out of IGG0201X + x'172'. This is the DFP module responsible for cleaning up resources.

IBM: I would like to verify the csect and offset that issued the SVC78, as I am not sure that it would match our sandbox. Can you run an AMBLIST LISTLOAD OUTPUT=XREF,MEMBER=IGG0201Z and let me know the csect and offset where IGG0201Z + x'26BA' falls? Can you also supply me with the rmid of this csect? If you don't mind, can you also e-mail this XREF listing to my attention at dfsms@ca.ibm.com.

Customer: AMBLIST sent via email. Also: IGG0201Z is at HDZ11F0/UW69673. IGG0201Z + x'26BA' also falls at X'172' in IGG0201X. IGG0201X is at HDZ11F0/UW81169. We are planning on putting on OA04182 with prereqs and coreqs to fix this problem this weekend.

IBM: Thanks for the info on IGG0201X. As mentioned above, similar problems have been reported earlier (another is 01964,487). But these have gone unresolved with requests for tracing and trapping with SLIP. Do you know if this problem is readily recreatable? The lead up to the problem in these other scenarios is identical to what we see here. CLOSE is issued by the ADABAS routine for DDSAVE1. The CLOSE Executor error handling routine, IGG0201B, issues EOV (svc37). EOV issues SVC0 to write out the buffers. We see SSCH to tape drive 0443, then the normal I/O interrupt, but then there is another SSCH to the same IOSB. It is while this second I/O is in progress that we issue the SVC78 FREEMAIN that results in the ABENDA78-18, since some of the buffers must still be fixed as a result of the I/O in progress. Have you heard back from the ADABAS folks on this issue? Let me know how recreatable this might be. Then I will check with level2 as to what kind of tracing/trapping they would like to see for this problem.

Customer: AMBLIST of IGC0005E sent via email. Offset x'511E' is in CSECT IFG0551L, which is at base HDZ11F0. I've been in communication with the ADABAS support guy, who has received one zap so far from SAG. I have seen the zap and it is zap AO742024 for ADA742. I've questioned its pertinence, as we are experiencing a FREEMAIN problem vs. a GETMAIN. I've asked ADABAS support to get further clarification. Here is the title of the zap:

    PROBLEM : Jobs using tape datasets experience GETMAIN failure,
              because the size of the QSAM buffers allocated has
              increased by up to eight times for datasets where
              Large Blocks (block sizes of up to 256K) are in use.

Also, did you see my comment earlier that we will be applying suspect fix OA04182 with pre's and co's this weekend? Please comment on that.

IBM: Yes, I did see your comments about APAR OA04182. Though it does not directly describe our symptoms, it does mention outstanding EXCPs, which may tie into the I/O in progress and freemain attempt. Also, in that other scenario, development are recommending the application of this fix, as well as APARs OW57695 and OW57553. Do you have these latter fixes applied?

Customer: Fixes OW57695 & OW57553 are not applied, but we will add those as well.

IBM: I discussed this issue further with level2. They have been in touch with development, who recommend the maintenance that I listed, since the changes affect the channel end appendages and may very well have an effect on this problem. Let me know once the maintenance has been applied.

Customer: In answer to your previous question, the problem is not readily recreatable, in that when they first started having it when they upgraded the ADABAS product, they removed the 'bufno' option from their job and the ADASAV utility ran fine for a period of approx. 2 weeks. Following that, when it would no longer work last week, we had them move their backup allocation from within the VTS drives to outside physical 3490s, and this is how they have been running since, in order to circumvent. After this weekend's maintenance I will have them direct their backups back to within the VTS and we will wait and see what happens. I've advised ADABAS support to refrain from applying any SAG maintenance, as I want to see what happens with our maintenance first.

IBM: Let me know if/when this problem should re-occur after the maintenance.

Customer: So far, so good after all the recommended maintenance (IBM only) was applied. ADASAV utility was changed back to within the VTS drives and worked fine yesterday. We will watch this for a period of time however (30 days) because the root cause still seems to be inconclusive (although EOV control block corruption is suspected in so

IBM: Thanks for the additional feedback/update. I will set a follow up date a month down the road to check back with, unless I hear back from you sooner.

Customer: Reason For Closure: The fixes corrected the Freemain problem within the ADASAV utility.
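The hex arithmetic IBM performs in the conversation above can be reproduced with a short Python sketch. It rests on one assumption drawn only from this page: the last two words of the second IEA705I line are taken to be the FREEMAIN length and address, because they match the x'A00000' and x'0ECAB000' that IBM reads from the SVC 78 trace entry. The names `FreemainRange`, `parse_iea705i_words` and `csect_origin` are hypothetical helpers, not part of any IBM tool.

```python
from dataclasses import dataclass

PAGE_BYTES = 4096  # 4K page


@dataclass
class FreemainRange:
    address: int
    length: int

    @property
    def end(self) -> int:
        """First address past the area being freed."""
        return self.address + self.length

    @property
    def pages(self) -> int:
        """Number of whole 4K pages in the area."""
        return self.length // PAGE_BYTES


def parse_iea705i_words(line: str) -> FreemainRange:
    """Assumption: the last two hex words are the length and the address."""
    words = line.split()
    if words and words[0] == "IEA705I":
        words = words[1:]  # drop the message id if present
    length, address = (int(word, 16) for word in words[-2:])
    return FreemainRange(address=address, length=length)


def csect_origin(lmod_offset: int, csect_offset: int) -> int:
    """Offset of a CSECT within its load module, given one matching point."""
    return lmod_offset - csect_offset


if __name__ == "__main__":
    for line in (
        "IEA705I 00FB9E80 009DFB60 009DFB60 00000300 00A00000 0ECAB000",
        "IEA705I 00F75880 009F9068 009F9068 00000300 00980000 0E653000",
    ):
        area = parse_iea705i_words(line)
        print(f"free x'{area.address:08X}' for x'{area.length:X}' bytes "
              f"({area.pages} pages, ends at x'{area.end:08X}')")
    # IGG0201Z + x'26BA' falls at x'172' in IGG0201X.
    print(f"IGG0201X starts at IGG0201Z + x'{csect_origin(0x26BA, 0x172):X}'")
```

For the job log in the Problem section, the length x'980000' is exactly 38 x 256K, which lines up with BUFNO=38 in the JCL if each buffer were 256K. That correspondence is an arithmetic observation only; the page does not state it.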