Home | The Company | Publications | Products | Links | Tips | Jobs |
---|
GeneralDifferent databases compress their records in different ways and with different success. But ADABAS is top. The ADABAS values are not based on the same files/tables:Databases Compression Rates --------- ----------------- ADABAS: 60% - 80% (92.5%) Oracle: 18% - 38% DB2 Viper: 58% - 60% MS SQL Server: none ADABASData CompressionFields (columns) in an ADABAS file (table) are normally compressed. ADABAS eliminates all trailing blanks from alphanumeric fields and all leading zeros in numeric fields. As a result, unpacked fields are packed, multiple empty fields within a record are compressed, and all empty fields at the end of a record are completely suppressed. ADABAS automatically compresses the data on field and record level.
ADABAS compresses files between 60% and 80%. Depends on the field attributes, the compression rate can be more than 80%, for example files with many alpha, MU and PE fields. I unloaded (ADAUNLD) a file/table with many fields/columns in one PE group, decompressed (ADACMP DECOMPRESS), and compressed (ADACMP COMPRESS). The report of the compress process for this file showed: COMPRESS Processing Statistics (data only -- not index): -------------------------------------------------------- Number of records read 8,489,961 Number of incorrect records 0 Number of Compressed records 8,489,961 Raw data 47,678,957,988 bytes Compressed data 3,776,461,073 bytes Compression rate 92.5 % Largest Compressed Record 1,781 bytes To see the entire output of the utilties, please click here. Some data compression examples from other companies, published on SAG-L: - Customer A = 60 % - Customer B = 86.4% -- number of records = 3,981,371 largest compressed record = 292 bytes - Customer C = 77.8% -- number of records = 2,999,582 largest compressed record = 346 bytesIndex Compression In addition to the standard index (ASSO) compression, ADABAS version 7 uses Forward Index Compression (parameter INDEXCOMPRESSION=YES). Within an index block, the part of the index value that is identical to the forward part of the previous index value is suppressed: for the Normal Index approx. up to 70%. OracleChris Eaton, Product Manager, IBM, writes:Oracle of course has table compression (and has had it for a while now). However, there is a big difference in the implementation which appears to result in big differences in the compression rates. In Oracle, compression occurs at the data block (page) level. So as you use the bulk loader, Oracle will look for repeating column values on the page and replace those values with a key. The key is stored in the page header and a pointer is stored in the row on that page. The problem with this method is that not only is it not looking for larger patterns that may occur across page boundaries but if there is a very common string, it is stored multiple times in every page header that contains that row value. TPC-H is a well-known data warehouse benchmark in which each vendor uses the same tables and same data. Oracle published a paper at the VLDB conference in 2003 on their compression capabilities. In this paper they showed compression rates for TPC-H tables. Lineitem table: 38% compression Orders table: 18% compression Entire database: 29% compression More on Ordix News 1/2007 about Oracle compression.
DB2 ViperChris Eaton, Product Manager, IBM, writes:In DB2 Viper, we look at the entire table to determine the repeated patterns, store them once, and then compress those values out of each page. In addition, DB2 will honor compression keys during insert and load while only the direct path loader in Oracle supports compression. Do any of these internal differences matter? They wouldn't if they produced the same results. But from publicly available compression results there is a big difference. Oracle license terms do not allow anyone to publish benchmarks without their prior consent so I can't tell you the differences that other customers are seeing. But I can tell you what Oracle published themselves and how that compares to DB2 Viper. TPC-H is a well-known data warehouse benchmark in which each vendor uses the same tables and same data. Oracle published a paper at the VLDB conference in 2003 on their compression capabilities. In this paper they showed compression rates for TPC-H tables. IBM ran the same tests on the same tables and here are the comparison results: Lineitem table: 58% compression Orders table: 60% compression Entire database: 59% compression Microsoft SQL ServerChris Eaton, Product Manager, IBM, writes:Compression in Microsoft SQL Server doesn't exist (at least not natively). In fact, according to support article, placing a SQL Server database on compressed volumes is not recommended and not supported. I have seen some references to using .Net compression libraries to call out from the database to some application code to do some compression but I don't think that's the same as DB2, Oracle or any other vendor's table compression. To read the entire BLOG about compression of Oracle, DB2 Viper and MS SQL Server, click here Back to ADABAS Tips, Tricks, Techniques -- Overview |