InnoDB data compression


InnoDB supports two file formats:
innodb_file_format=Antelope/Barracuda

The built-in InnoDB (innobase) shipped with MySQL only supports Antelope. Barracuda is only supported by the InnoDB plugin.
 
Installing the InnoDB plugin as a shared library:

How to build the InnoDB plugin and use it in the MySQL server:
1. How to build:
1.1. tar -zxf mysql.tar.gz
1.2. ./configure --prefix=/path/to/mysql --with-plugins=innodb_plugin
1.3. make && make install
 
2. How to configure:
Add the following lines to the MySQL configuration file (the plugin-load value must be a single line):
ignore_builtin_innodb
plugin-load=innodb=ha_innodb_plugin.so;innodb_trx=ha_innodb_plugin.so;innodb_locks=ha_innodb_plugin.so;innodb_lock_waits=ha_innodb_plugin.so;innodb_cmp=ha_innodb_plugin.so;innodb_cmp_reset=ha_innodb_plugin.so;innodb_cmpmem=ha_innodb_plugin.so;innodb_cmpmem_reset=ha_innodb_plugin.so


Then start MySQL; it will use the InnoDB plugin instead of the built-in innobase.
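To confirm the plugin is actually in use, you can inspect the loaded plugins and the InnoDB version (the built-in innobase does not expose innodb_version; the plugin does) — a quick sanity check, assuming the config above took effect:

```sql
-- List loaded plugins; InnoDB should appear with Library = ha_innodb_plugin.so
SHOW PLUGINS;

-- The InnoDB plugin exposes its own version (e.g. 1.0.x); built-in innobase does not.
SELECT @@innodb_version;
```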

Building the InnoDB plugin from its standalone source code: (I hit an error during make, so I skipped this build method.)

Enabling the Barracuda file format (in the MySQL configuration file):

default_storage_engine=InnoDB
innodb_file_per_table=1
innodb_file_format=barracuda
innodb_strict_mode=1



Table definition:
ROW_FORMAT=COMPRESSED KEY_BLOCK_SIZE={1,2,4,8,16}
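Putting the pieces together, a table might be created like this (the table and column names are made up for illustration):

```sql
-- Hypothetical example: compress index pages into 8KB blocks.
-- Requires innodb_file_per_table=1 and innodb_file_format=Barracuda;
-- with innodb_strict_mode=1, an invalid KEY_BLOCK_SIZE is an error, not a warning.
CREATE TABLE article (
    id   BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    body TEXT
) ENGINE=InnoDB
  ROW_FORMAT=COMPRESSED
  KEY_BLOCK_SIZE=8;
```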


1. When to use compression:

Compression works best on tables that include a reasonable number of character string columns and where the data is
read far more often than it is written.

1.1 Data characteristics:
Character strings often compress well, whether defined in CHAR, VARCHAR, TEXT or BLOB columns. On the other hand, tables containing mostly binary data (integers or floating-point numbers), or data that is already compressed (for example, JPEG or PNG images), generally do not compress well, if at all.


1.2 Workload Characteristics:
If the application is dominated by reads, rather than updates, fewer pages need to be reorganized and recompressed after the index
page runs out of room for the per-page “modification log” that InnoDB maintains for compressed data. If the updates predominantly
change non-indexed columns or those containing BLOBs or large strings that happen to be stored “off-page”, the overhead of
compression may be acceptable. If the only changes to a table are INSERTs that use a monotonically increasing primary key,
and there are few secondary indexes, there is little need to reorganize and recompress index pages. Since InnoDB can “delete-mark”
and delete rows on compressed pages “in place” by modifying uncompressed data, DELETE operations on a table are relatively efficient.

1.3 Compressing in the database or in the application:

1.4 Configuration characteristics:
When a page of a compressed table is in memory, InnoDB often uses an additional 16KB in the buffer pool for an uncompressed
copy of the page. The adaptive LRU algorithm in the InnoDB storage engine attempts to balance the use of memory between
compressed and uncompressed pages, taking into account whether the workload is running in an I/O-bound or CPU-bound manner.

1.5 Choosing the compressed page size:
The compressed page size should always be bigger than the maximum record size. Setting the compressed page size too
large wastes some space, but the pages do not have to be compressed as often. If the compressed page size is set too small,
inserts or updates may require time-consuming recompression, and the B-tree nodes may have to be split more frequently,
leading to bigger data files and less efficient indexing.
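Whether the chosen KEY_BLOCK_SIZE fits the data can be judged from the INNODB_CMP table (made available by the innodb_cmp entry in the plugin-load line): a low ratio of successful to attempted compressions signals the page size is too small.

```sql
-- Per compressed page size: compression attempts vs. successes.
-- If compress_ops_ok is much lower than compress_ops, pages frequently
-- fail to fit and get split/recompressed; consider a larger KEY_BLOCK_SIZE.
SELECT page_size,
       compress_ops,
       compress_ops_ok,
       compress_time,
       uncompress_ops
FROM information_schema.INNODB_CMP;
```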


2. Compression and the InnoDB buffer pool:

The InnoDB buffer pool can hold both the compressed page and an uncompressed copy of the same page.
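How much buffer pool memory the compressed pages occupy can be inspected through the INNODB_CMPMEM table (made available by the innodb_cmpmem entry in the plugin-load line):

```sql
-- Buffer pool memory used by compressed pages, broken down by page size.
SELECT page_size,
       pages_used,
       pages_free,
       relocation_ops
FROM information_schema.INNODB_CMPMEM;
```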

3. Compression and innodb log file:

Before a compressed page is written to a database file, InnoDB writes a copy of the page to the redo log (if it has
been recompressed since the last time it was written to the database). This is done to ensure that redo logs will
always be usable, even if a future version of InnoDB uses a slightly different compression algorithm.