2014년 7월 17일 목요일

Defragmentation InnoDB table on MariaDB 10.0

Sometimes some of our service, there's a lot of DELETE but some case free space of removed rows can not be reused (Especially AutoIncrement primary key).
We have looking for online defragmentation feature for InnoDB long time.
MySQL 5.6 Facebook patch have this feature (I found this a few weeks ago), but not in MariaDB.
So I ported Facebook InnoDB defragmentation feature to MariaDB 10.0.

You can see more detailed explanation and patched source code from below facebook github and kakao tech blog.

https://www.github.com/facebook/mysql-5.6
http://kakao-dbe.blogspot.kr/2014/07/defragment-innodb-table-on-mariadb-100.html


Basic test (Defragmentation efficiency)

After insert 1 million rows, removed some rows(50% from tb_t50, 30% from tb_t30, 20% from tb_t10) from target tables.
And compare three table's data pages count after run "ALTER TABLE tb_txx DEFRAGMENT" command.

CREATE TABLE `tb_t50` (
  `fdpk` int(11) NOT NULL AUTO_INCREMENT,
  `fd1` char(100) NOT NULL,
  `fd2` char(100) NOT NULL,
  PRIMARY KEY (`fdpk`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

CREATE TABLE `tb_t30` (
  `fdpk` int(11) NOT NULL AUTO_INCREMENT,
  `fd1` char(100) NOT NULL,
  `fd2` char(100) NOT NULL,
  PRIMARY KEY (`fdpk`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

CREATE TABLE `tb_t10` -- // ==> Actually this table name must be tb_t20 )
  `fdpk` int(11) NOT NULL AUTO_INCREMENT,
  `fd1` char(100) NOT NULL,
  `fd2` char(100) NOT NULL,
  PRIMARY KEY (`fdpk`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;



Test scenario


  1. Insert 1048576 rows to tb_t50, tb_t30, tb_t10(fdpk column value is auto incremented from 1)
  2. Restart MariaDB server for clearing InnoDB buffer pool
  3. Check loaded page count from InnoDB buffer pool after running below query from tb_t50
    select count(*)
      from tb_t50 use index(primary)
      where fdpk between 1 and 1000000
      order by fdpk;
    select count(*)
      from information_schema.innodb_buffer_page
      where table_name='`test`.`tb_t50`';
  4. Delete 50% of rows from tb_t50, 30% of rows from tb_t30, 20% of rows from tb_t10
    delete from tb_t50 where fdpk%2=0;
    delete from tb_t30 where fdpk%3=0;
    delete from tb_t10 where fdpk%5=0;
  5. Restart MariaDB server
  6. Run select query from tb_t50 to load disk data page to InnoDB buffer pool
  7. Checking loaded page count from InnoDB buffer pool for tb_t50
  8. Doing 6) ~ 7) for tb_t30 and tb_t10



Test result








According to the first graph, pages will be merged automatically when removing 50% of rows from pages. 
But 1/3 and 1/5 case, pages are not merged automatically and stay intactly.
Average row count per page is decreased in tb_t20 and tb_t30 table (According to second graph).

But after defragmentation, average row count per page is getting same among three tables (4th graph). 
And loaded page of InnoDB buffer pool is getting smaller than before(before deletion)  (3rd graph).


Defragmentation Performance and Processing time

Below graph shows disk read iops according to innodb_defragment_frequency system variable.




  1. innodb_defragment_frequency = 100
  2. innodb_defragment_frequency = 1000
  3. innodb_defragment_frequency = 10
  4. innodb_defragment_frequency = 50


innodb_defragment_frequency is used to determine how many times merge method will be call per each second.
So greater value set to this variable more pages would be merged per second. So total defragmentation time is getting low.
But InnoDB defragment thread will read more pages if you set higher value to innodb_defragment_frequency system variable.
If you have fast ssd or whole data could be loaded into InnoDB buffer pool you should increase this variable.
Unless you should decrease this value.

Warning

You can not use this InnoDB defragmentation feature to shrink InnoDB tablespace's disk size.
This feature merge pages and free some page to reserved space of table, not to operating system.

댓글 없음:

댓글 쓰기