본문 바로가기
MySQL Internal/InnoDB Internal

MySQL의 History list에 관한 문서

by 모모레 2014. 1. 23.

다음의 내용은 다음 url의 내용을 번역한 것입니다.

역시나, 난 번역 전문가가 아니므로, 번역이 잘못 되었을수도 있구요.

그런게 있으면, 잘 아시는 분이 잘 좀 알려주세요~~


원본

http://www.pythian.com/blog/some-fun-around-mysql-history-list/


Some Fun with MySQL’s History List

Apr 27, 2012 / By Marco Tusa

Why this article?

First of all, because I had fun digging in the code.

Then, I was reading a lot about the improvements we will have in MySQL 5.6 and about some already present in 5.5. Most of them are well covered by people that certainly know more than me, so I read and read, but after a while became curious. I began reading the code and performing tests. I started to compare versions, like 5.1 -- 5.5. -- 5.6. One of the things I was looking for was how the new Purge thread mechanism works and what were its implications. I have to say that it seems to work better than the previous versions, and the Dimitry blog (see reference) seems to confirm that.

So again, why the article? Because I think there are some traps here and there, and I feel the need to write about them. The worse behavior is with MySQL 5.5. This is because in 5.5 we have an intermediate situation, where the purge is not fully rewritten as in 5.6, but also not bound to the main thread.

굳이 번역 하지 않아도 될거 같아서 그냥 패스~~


What is the history list?

MySQL uses (from Innodb 1.1 MySQL 5.5) 1 to 128 Rollback segments, each able to support 1023 transactions. Each transaction is assigned to one of the rollback segments and remains tied to that rollback segment for the duration of the transaction. This enhancement improves both scalability (higher number of concurrent transactions) and performance (less contention when different transactions access the rollback segments).
History list is tightly bound to the undo log representing the number of Header Pages related to undo log segments, segments that are related to finalize transactions, commit, or roll back. That’s it: History list represents the number of not-yet-flushed segments in the undo log.

InnoDB 1.1을 사용하는 MySQL 5.5는 1 부터 128개의 Rollback segment를 사용한다. 이 Rollback Segment는 각각 1023개의 트랜잭션을 지원할 수 있다. 각각의 트랜잭션은 하나의 rollback segment를 사용하며 트랜잭션이 유지되는 동안 하나의 rollback segment를 사용한다. 이 구조는 확장성(동시 진행할 수 있는 트랜잭션의 수)과 성능(롤백세그먼트를 사용하려는 트랜잭션들 사이의 경쟁)에 다 효과가 있다.
History List는 undo log segment(transaction이 commit또는 rollback해야 하는)에 연관된 header page의 수를 나타내는 undo log에 묶여져 있다. 즉, History list는 undo log안의 아직 flush 되지 않은 segment의 수를 나타낸다.


Old and new

Previously, the purge thread was directly controlled by the main thread in InnoDB, causing serious side effects (more info here). The main change was moving out the purge thread and allowing it to run it isolated so that it doesn’t impact other processes. The move was done in Innodb 1.1 (present in 5.1 and 5.5).
But that is not all: The new version of purge has a mechanism that allows it to be more or less aggressive in relation to the increase or decrease of the History List length. This option is enabled in the 5.6 release and is taking the innodb_purge_threads as the “up to value” it can use to increase the number of threads for purging.

이전의 purge thread는 InnoDB의 main thread에서 직접적으로 처리하게 되어있어서 생각보다 심각한 문제를 야기시키곤 했다. 그래서 InnoDB 1.1부터 purge thread로 분리되어 독립적으로 실행될 수 있게 변경되었다.
그러나 그게 전부가 아니다. purge의 새 버젼은 History list의 길이의 증가나 감소에 따라 동작할 수 있는 메커니즘을 제공하는데 그것은 5.6에서 부터 추가된 innodb_purge_threads이다. 이것은 purge thread의 갯수를 변경할 수 있게 해준다.


Different behavior

An obvious fact of the new Purge mechanism is that its behavior is quite different from the previous one, i.e. 5.0 (innoDB 1.0) or 5.1. In the previous versions of InnoDB, we were taught to consider the History List as something that should always be close to 0; not always, but as much as possible.

새로운 purge 구조의 명백한 사실은 이전과 다르게 동작한다는 것이다. 이전 버젼에서 사용자는 History List가 0에 가까워 지는지 항상 고려해야만 했다. 하지만, 항상 고려한다는 것은 사실 가능하지는 않다.

Frankly speaking, that was always a little bit confusing, but the MANUAL suggested in several place to keep it between reduced numbers:

솔직히 얘기해서, 이것은 좀 혼란스럽게 한다. 그러나, 매뉴얼에서는 그 수를 줄이는 사이에 그것을 유지하기 위해 여러 방법들을 제안한다.

“If you insert and delete rows in smallish batches at about the same rate in the table, the purge thread can start to lag behind and the table can grow bigger and bigger because of all the “dead” rows, making everything disk-bound and very slow In such a case, throttle new row operations, and allocate more resources to the purge thread by tuning the innodb_max_purge_lag system variable.” --InnoDB Multi-Versioning

하나의 테이블에 같은 비율로 insert와 delete가 주기적으로 작은 배치잡으로 동작하게 된다면, purge thread는 뒷단에 log가 발생하기 시작한다. 그리고 그 테이블은 그 dead한 row들 때문에 점점더 커지기 시작한다.    이것은 시스템 성능을 떨어뜨리는 원인이 될 수 있다. 이와 같을 때는 innodb_max_purge_lag 파라메터를 튜닝하여 purge thread에 더 많은 리소스를 할당할 수 있게 해야 한다. --InnoDB Multi-Versioning

A classic scenario for such issue is an application taking track of the activity on the network that requires to write a huge number of small inserts into the database. From my tests, I have seen an incredible number of entries in the history list in 5.5 that were not present in the previous InnoDB version and that are not present again in 5.6.

이런 문제를 위한 가장 기본적인 시나리오는 작은 insert가 많이 발생하게 하는 application이라고 할 수 있다. 나의 테스트를 통해 나는 다른 버전에서는 본적 없는 MySQL 5.5에서 history list의 값으로 엄청나게 많은 수가 보이게 한 적이 있다.

The point is that it could happen to have so many transactions, doing INSERT, UPDATE, or DELETE, that the History and un-flushed undo log grow too much. To prevent issues, we should tune the value of the innodb_max_purge_lag in order to allow InnoDB to complete the PURGE operations.

요점은 즉, 많은 트랜잭션이 발생하고 있다면 flush 되지 않은 undo log와 history는 계속적으로 증가한다는 것이다. 이것을 방지하기 위해, 우리는 innodb_max_purge_lag를 튜닝하여 InnoDB가 PURGE 작업을 완벽히 수행할 수 있게 해야 한다는 것이다.


Innodb_max_purge_lag is the highest number we want to have in history list, above which Innodb will start to apply an indiscriminate delay in pushing the operations.

innodb_max_purge_lag는 history list가 가질 수 있는 가장 높은 수를 가진다. 그리고 나면, InnoDB는 작업이 진행되면서 무차별적으로 delay가 발생하게 할 것이다.


The formula is quite simple:
((purge_lag/innodb_max_purge_lag)×10)--5 milliseconds.

Or following the code we have:
float ratio = (float) trx_sys->rseg_history_len/ srv_max_purge_lag;
((ratio - .5) * 10000);

If we have a History list of 1200 and have set innodb_max_purge_lag to 1000, the result should be:
((1200/1000)X10)-5= 7 ms delay for operation. following the manual
((1200/100) -5) * 10000 = 7000 <-- Microsecond following the code


만약, 1200개의 history list를 가지고 있고, innodb_max_purge_lag가 1000이라고 한다면,  작업당 7 ms의 작업 지연이 발생하게 되고, 코드에 따르면 7000 micro second만큼의 작업 지연이 발생하게 된다.


All match and delay will be 7 ms. Also, the max limit in previous versions for the delay was of 4295 seconds!! In 5.5/5.6, we see a different way of managing the max number of seconds:
if (delay &gt; srv_max_purge_lag_delay) {
delay = srv_max_purge_lag_delay;
}


모든 지연시간은 7ms가 된다. 이전 버젼의 지연되는 시간의 max limit는 4295초 였다. 5.5/5.6에서는 다음과 같은 식으로 계산된다.


Srv_max_purge_lag_delay max value is 10 seconds. So in the worse case scenario, the max delay will be 10 seconds.

srv_max_purge_lag_delay 의 최대값은 10초이다. 그래서, 이와 같은 시나리오라면, max delay 값은 10초가 된다.


In the past, we were used to see the History list going up and down (relatively) fast, so the force delay was playing its role efficiently. At the same time, we knew that all operations in the Main threads where slowed down, so the forced delay was something we had to leave with or worse things could happen, like the need to perform an emergency page flush from the buffer pool to allow REDO to recover space.

과거에, 우리는 History list가 빠르게 값이 올라가고 내려가는 것을 보았다. 그래서, 이와 같은 강제적인 지연은 그것에 기능에 효과적이었다. 동시에, 우리는 main thread의 모든 작업의 진행이 느려지는 것도 확인했고, 그래서 이와 같은 강제적인 지연은 다른 부수적인 문제를 일으켰다.


But something has changed… For the better, obviously, but we must be careful.

그런데 이것이 좀 변했다. 나아졌으나 우리는 좀 조심해야 해졌다.

It’s better because now the purge thread works independently and can scale, and pending undo flushes does not slow down the other operations. Also in 5.6, MySQL could be more or less aggressive in relation to the History list to purge.

purge thread가

But these operations remain something we should monitor and tune, for two main reasons:

- Space taken by undo log segments is still an issue, and now that the number is increased, it could be even worse.

- Setting a wrong value for innodb_max_purge_lag could kill our performance.
Let us start digging a bit

First of all, History list and purge log thread are still very badly documented in the InnoDB Status Report.

In 5.5, we can count on the History list information, number of transaction purge has being up to, and the number of purged records. Finally, in 5.6 we have the generic state. There isn’t too much here; better info like number or running threads, real segments used (1-128), number of Header Pages in the segments, and dimension (at least pages as Percona does) would be nice to have. Undo log is currently stored inside the system tablespace. This means that IT WILL BE IMPOSSIBLE to shrink the size of the table space once undo log has taken a huge amount of space.

That’s it: 80% or more of the size of a system table space is for the undo log when using innodb_file_per_table. This was already true when InnoDB was using a single set of segments (1023), but now that it can scale up to 130944 and supports better and more transactions, the size on the disk can explode.

Some numbers to have better understanding:

-History list: 359585

-Insert/s: 35649.67

-Pages to flush in the undo log: 429126

-Means that in the undo log it’s a total size of: ~ 6.7GB

Considering that the normal block size is 4K and each page is 4 operations, we will have 1,716,504 operations, assuming that each page is be sequential. This means 3 ms for Seek+half rotation then, 1 ms transfer for the first operation, then 2ms for the remaining will result in 12ms for each page delete on disk.

This means it will take 5149.512 seconds (85 minutes) at 6.7 Mb/s given the partial random write to flush the whole. Obviously, this number changes in respect to the available cache and available spindles. Also, more threads equals more capacity in write and less time, so the option innodb_purge_threads is more than welcome.
Setting the right value for innodb_max_purge_lag

When setting this value, we must keep into consideration the following:

- The value is the number of head pages representing an undo log relation to a running/executed transaction.

- Purge can apply only to complete transaction.

- Delay applies to all write operations inside InnoDB.

Nowadays it is not uncommon to have high numbers in history list. In this case, “just” 359,585 head pages is very important to correctly balance the delay point of start with the real load, like transactions/sec and the kind of operations that are ongoing on the server.

To clarify the relevance in the case of reads, and not only of writes, let me cite:

Running Very Long Transaction If you’re running very long transaction, be it even SELECT, Innodb will be unable to purge records for changes which are done after this transaction has started, in default REPEATABLE-READ isolation mode. This means very long transactions are very bad causing a lot of garbage to be accommodated in the database. It is not limited to undo slots. When we’re speaking about Long Transactions the time is a bad measure. Having transaction in read only database open for weeks does no harm, however if database has very high update rate, say 10K+ rows are modified every second even 5 minute transaction may be considered long as it will be enough to accumulate about 3 million of row changes. (Peter Z.)

But what can really harm your the system is the delay defined by the purge lag to improve the flushing. Assume we define it to innodb_max_purge_lag=200,000, and we do have the number of pending flush as for the above, 359585. Doing calculation as for the previous formula:

((359585/200000)X10)-5= 12.97925 ms delay for operation.

Hey that’s not too bad. I will delay only 12.97925 ms to operation/Insert to help the purge. But what is unclear is what an operation is for the delay, or rather, where does the delay really apply?

Ready? Are you sitting comfortably?

2) row0mysql.c
   
/*******************************************************************//**
Delays an INSERT, DELETE or UPDATE operation if the purge is lagging. */
static
void
row_mysql_delay_if_needed(void)
/*===========================*/
{
if (srv_dml_needed_delay) {
os_thread_sleep(srv_dml_needed_delay);
}
}

3)os0thread.c
   
/*****************************************************************//**
The thread sleeps at least the time given in microseconds. */
UNIV_INTERN
void
os_thread_sleep(
/*============*/
ulint tm) /*!&lt; in: time in microseconds */
{
#ifdef __WIN__
Sleep((DWORD) tm / 1000);
#else
struct timeval t;
t.tv_sec = tm / 1000000;
t.tv_usec = tm % 1000000;
select(0, NULL, NULL, NULL, &amp;t);
#endif
}

Do you get it? The delay is per ROW.

So assume that you have a system checking connections status and traffic, collecting statistics every minute for your 100000 connected users, each user generating at least 8 inserts, plus a consolidation operation to get on average 10 minutes for each insert. Each insert per user takes 0.002 second, and all traffic is managed by 300 threads.

100000 x 8 = 800000/300 = 2666 insert for transactions x 0.002ms each = 5.3sec to run all the operations.

Now, what if we apply the delay given we have History list quite high? I have to add the 12ms to the 0.002, which will give us 0.122 sec. This gives us 325 seconds (5.4 minutes) for each thread!!!!

Do not forget the consolidation process, which needs to run every 10 minutes. So it has to process (1 reads for user per minute x 100000 users) x 10 minutes and split in 10 threads doing it by user id set, assuming each read per user will take 0.001 ms (given already in memory) and 0.002 ms for write.

Without delay, we will have = 1,000,000 / 10 = 100,000 x 0.003 = 300sec (5 minutes) for thread. With delay, it will be 0.122 = 12200 seconds (203 minutes) for each operation.

Last but not least, the consolidation will collide with the inserts, causing possible increase of the delay because of the REPEATBLE_READ. Another possible issue is that the consolidation will cause pages to remain in a dirty state for too long, possibly causing serious issues in the REDO log in case we need to free space.

I did push some numbers a little bit just to make the scenario more clear. Now, just to remove some possible doubt:
Are you sure that it really pushed it by ROW?

Let us create a working test:
   
CREATE TABLE `City_test2` (
`ID` int(11) NOT NULL DEFAULT '0',
`Name` char(35) NOT NULL DEFAULT '',
`CountryCode` char(3) NOT NULL DEFAULT '',
`District` char(20) NOT NULL DEFAULT '',
`Population` int(11) NOT NULL DEFAULT '0',
`satisfaction_grade` char(50) DEFAULT NULL,
`previous_satgrade` varchar(50) DEFAULT NULL,
KEY `satisfaction_grade` (`satisfaction_grade`,`previous_satgrade`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
insert into City_test2 select * from city;
mysql> truncate table City_test2;insert into City_test2 select * from city limit 1000;
Query OK, 0 rows affected (0.24 sec)
Query OK, 1000 rows affected (0.10 sec)
Records: 1000 Duplicates: 0 Warnings: 0

So, running the query, we see the History going up and down, but never to 0 as before. However, if we change the innodb_max_purge_lag, set it to a lower value than the history (say history is 137), and set the lag to 100, in theory there should be the following delay as for instructions:

((purge_lag/innodb_max_purge_lag)×10)--5 milliseconds.

The result should be
((137/100)X10)-5=8.7 ms


mysql> set global innodb_max_purge_lag=100;
Query OK, 0 rows affected (0.00 sec)
mysql> truncate table City_test2;insert into City_test2 select * from city limit 1000;
Query OK, 0 rows affected (0.10 sec)
Query OK, 1000 rows affected (7.40 sec)
Records: 1000 Duplicates: 0 Warnings: 0

Close enough from my point of view, and it increases at the increasing number of rows.

Changing innodb_purge_batch_size, innodb_rollback_segments, and innodb_purge_threads will change how the purge will work but not the delay. Innodb_max_purge_lag is the only possible parameter to use, and it’s not enough.
Conclusions

The new mechanism for the purge is much more flexible and is able to scale. The fact that it’s now separate from the main thread reduces a lot of the negative effects. Nevertheless, the capacity that the undo log purge now has also presents possible risks. Huge pending flushes means possible huge space on disk and/or huge delay.

The delay behavior is not new and was already present MySQL 5.0, but the new capacity brings it to a higher level of risk, especially in 5.5. In 5.6 the purging thread, is much more efficient, and getting a huge history list was difficult, but when I did get it I saw the same behavior.

Whenever we need to tune the purge lag settings, the value needs to be set not in relation to the history list, but in relation to the maximum acceptable delay in the inserts as a cumulative value.