<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>6lab.cz | RSS Feed</title>
	<atom:link href="http://6lab.cz/author/peter-drienko/feed/" rel="self" type="application/rss+xml" />
	<link>http://6lab.cz</link>
	<description>Networking, IPv6, Security</description>
	<lastBuildDate>Tue, 24 Oct 2017 08:54:46 +0000</lastBuildDate>
	<language>en-US</language>
		<sy:updatePeriod>hourly</sy:updatePeriod>
		<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.9.1</generator>
	<item>
		<title>Optimizing network tracking tool for management of BUT campus network</title>
		<link>http://6lab.cz/optimizing-network-tracking-tool-for-management-of-but-campus-network-3/</link>
		<comments>http://6lab.cz/optimizing-network-tracking-tool-for-management-of-but-campus-network-3/#comments</comments>
		<pubDate>Wed, 22 May 2013 07:26:50 +0000</pubDate>
		<dc:creator><![CDATA[Peter Drienko]]></dc:creator>
				<category><![CDATA[Monitoring]]></category>
		<category><![CDATA[Networking]]></category>

		<guid isPermaLink="false">http://6lab.cz/?p=1603</guid>
		<description><![CDATA[This article describes optimization of the database structure in Zabbix monitoring tool using automatic partitioning. With this optimization we achieved a decrease of the processor utilization on the Zabbix server and a significant decrease in the growth of data on its hard drive. Last Updated: 27th of May, 2014. Introduction ... <a href="http://6lab.cz/optimizing-network-tracking-tool-for-management-of-but-campus-network-3/" class="more-link">Read More</a>]]></description>
				<content:encoded><![CDATA[<p>This article describes optimization of the database structure in Zabbix monitoring tool using automatic partitioning. With this optimization we achieved a decrease of the processor utilization on the Zabbix server and a significant decrease in the growth of data on its hard drive.<br />
Last Updated: 27th of May, 2014.</p>
<h2>1 Introduction</h2>
<p>Zabbix is a semi-distributed, open-source monitoring solution with central management. It has been successfully used in the BUT campus network for several months and has reached the point where optimization of its database was needed due to the high volume of new data being inserted into the tables. This article describes how we optimized Zabbix&#8217;s PostgreSQL database, provides an installation manual, and presents the results we accomplished.</p>
<h2>2 Description of the BUT network</h2>
<p>The BUT campus network that Zabbix monitors is a 10Gbit backbone network connected to the CESNET academic network. The network consists mainly of network switches, uninterruptible power supplies, network printers and some other elements. Since 2005 the team managing this network has also managed the students&#8217; dormitory network called KolejNet, which consists of more than 7000 end stations. To give a general idea of the size of the network, we provide a table with specific information from the Zabbix monitoring client.</p>
<p align="center">Table 1: Key performance indicators of the BUT Zabbix server</p>
<p style="text-align: center;"><a href="http://6lab.cz/new/wp-content/uploads/2013/05/Untitled.png"><img class="aligncenter size-full wp-image-1624" src="http://6lab.cz/new/wp-content/uploads/2013/05/Untitled.png" alt="" width="493" height="108" /></a></p>
<p>As table 1 shows, approximately 32 new values are added to the tables every second. With the current way of storing the data, this means that the biggest table in the database has more than 530 million rows and the total size of the whole database is nearly 120 GB. This makes working with such a huge database very time- and resource-consuming.</p>
<h2>3 State of the BUT Zabbix server before optimization</h2>
<p>The current way of maintaining this huge database is quite simple: periodically run DELETE queries that remove all rows older than 365 days. There are two problems with this approach:</p>
<ol>
<li>A DELETE query does not free up disk space after deleting the rows, due to Multiversion Concurrency Control (MVCC) in PostgreSQL</li>
<li>Running DELETE on so many rows increases the CPU utilization and slows down the system</li>
</ol>
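<p>The first problem can be observed directly. A minimal sketch, assuming the standard Zabbix <code>history</code> table and PostgreSQL&#8217;s built-in size functions:</p>
<pre>-- Record the on-disk size of the table (heap, indexes and TOAST)
SELECT pg_size_pretty(pg_total_relation_size('history'));

-- Delete all rows older than 365 days
DELETE FROM history WHERE clock &lt; EXTRACT(epoch FROM now() - INTERVAL '365 days');

-- The reported size stays the same: MVCC only marks the rows as dead.
-- A plain VACUUM merely makes the space reusable inside the table;
-- only the heavily locking VACUUM FULL would return it to the OS.
SELECT pg_size_pretty(pg_total_relation_size('history'));</pre>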
<p>As we can see in picture 1, the size of the database on the hard drive is constantly increasing, even though we are performing DELETE queries on old rows.</p>
<p style="text-align: center;"><a href="http://6lab.cz/new/wp-content/uploads/2013/05/Untitled1.png"><img class="alignnone size-full wp-image-1630" src="http://6lab.cz/new/wp-content/uploads/2013/05/Untitled1.png" alt="" width="608" height="271" /></a></p>
<p align="center"> Picture 1: The size of the PostgreSQL database before optimization</p>
<p>Picture 2 shows how periodic deleting of rows using the DELETE query affects CPU utilization (increase of the CPU load marked with red arrows).</p>
<p style="text-align: center;" align="center"><a href="http://6lab.cz/new/wp-content/uploads/2013/05/Untitled2.png"><img class="alignnone size-full wp-image-1631" src="http://6lab.cz/new/wp-content/uploads/2013/05/Untitled2.png" alt="" width="608" height="329" /></a></p>
<p style="text-align: center;" align="center">Picture 2: Increase of the CPU utilization when using DELETE queries on many rows</p>
<p>We&#8217;ve also taken a look at how the PostgreSQL planner plans queries before the optimization. The following query:</p>
<pre>EXPLAIN ANALYZE SELECT *
FROM history
WHERE clock &gt; 1394841600 AND clock &lt; 1395014400</pre>
<p>will be executed the following way:</p>
<pre>QUERY PLAN
--------------------------------------------------------------
Seq Scan on history  (cost=0.00..926510.60 rows=2839499 width=21) (actual time=636.116..5824.925 rows=2858660 loops=1)
  Filter: ((clock &gt; 1394841600) AND (clock &lt; 1395014400))
  Rows Removed by Filter: 40412381
Total runtime: 5928.246 ms</pre>
<p>We can see that the planner uses a sequential scan on the whole table which takes almost 6 seconds to return the results.</p>
<h2>4 Optimization proposal</h2>
<p>The main idea of the optimization is twofold: first, change the structure of the database, and second, replace the row-deleting DELETE queries with a mechanism that doesn&#8217;t use DELETE at all.</p>
<p>The proposed approach is to partition the huge parent tables into smaller child partitions, as shown in picture 3.</p>
<p style="text-align: center;"><a href="http://6lab.cz/new/wp-content/uploads/2013/05/Untitled3.png"><img class="alignnone size-full wp-image-1632" src="http://6lab.cz/new/wp-content/uploads/2013/05/Untitled3.png" alt="" width="434" height="259" /></a></p>
<p style="text-align: center;" align="center">Picture 3: New structure of the database</p>
<p>We&#8217;ve compared two existing partitioning solutions for Zabbix:<br />
[1] requires an empty database<br />
[2] does not require an empty database</p>
<p>We decided to implement the second solution mainly because:</p>
<ul>
<li>when a new value is being added, if the partition where the new value should be stored doesn&#8217;t exist, it is created automatically</li>
<li>it does not require an empty database</li>
<li>the code is simpler than the first version</li>
<li>it does not use any emergency or temporary tables like the first version</li>
<li>it contains a function to delete old partitions</li>
</ul>
<h2>5 Implementation</h2>
<p>We will implement this optimization in four steps:</p>
<ol>
<li>Creation of the partitioning function</li>
<li>Setting triggers for every table</li>
<li>Creation of the function which will be deleting partitions</li>
<li>Migration of the old records to the new partitions</li>
</ol>
<p>The implemented solution uses the procedural language PL/pgSQL. To install these functions on your PostgreSQL server, just run them as you would any other SQL query.</p>
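<p>For example, assuming the functions below are saved to a file (the name <code>partitioning.sql</code> is only illustrative here) and the database and role are both named <code>zabbix</code>, they can be loaded with psql:</p>
<pre>psql -U zabbix -d zabbix -f partitioning.sql</pre>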
<p><strong>Creation of the partitioning function</strong></p>
<pre>CREATE SCHEMA partitions
AUTHORIZATION zabbix;</pre>
<pre>CREATE OR REPLACE FUNCTION trg_partition()
RETURNS TRIGGER AS
$BODY$
DECLARE
prefix text := 'partitions.';
timeformat text;
selector text;
_interval INTERVAL;
tablename text;
startdate text;
enddate text;
create_table_part text;
create_index_part text;

BEGIN
selector = TG_ARGV[0];

IF selector = 'day' THEN
timeformat := 'YYYY_MM_DD';
ELSIF selector = 'month' THEN
timeformat := 'YYYY_MM';
END IF;

_interval := '1 ' || selector;
tablename :=  TG_TABLE_NAME || '_p' || TO_CHAR(TO_TIMESTAMP(NEW.clock), timeformat);

EXECUTE 'INSERT INTO ' || prefix || quote_ident(tablename) || ' SELECT ($1).*' USING NEW;

RETURN NULL;

EXCEPTION
WHEN undefined_table THEN

PERFORM delete_partitions(quote_ident(tablename));

startdate := EXTRACT(epoch FROM date_trunc(selector, TO_TIMESTAMP(NEW.clock)));
enddate := EXTRACT(epoch FROM date_trunc(selector, TO_TIMESTAMP(NEW.clock) + _interval ));

create_table_part:= 'CREATE TABLE IF NOT EXISTS '|| prefix || quote_ident(tablename) || ' (CHECK ((clock &gt;= '
|| quote_literal(startdate) || ' AND clock &lt; ' || quote_literal(enddate) || '))) INHERITS ('|| TG_TABLE_NAME || ')';

IF TG_TABLE_NAME = 'events' THEN
	create_index_part:= 'CREATE INDEX '|| quote_ident(tablename) || '_1 on ' || prefix
|| quote_ident(tablename) || '(eventid,clock)';
ELSE
	create_index_part:= 'CREATE INDEX '|| quote_ident(tablename) || '_1 on ' || prefix
|| quote_ident(tablename) || '(itemid,clock)';
END IF;

EXECUTE create_table_part;
EXECUTE create_index_part;

EXECUTE 'INSERT INTO ' || prefix || quote_ident(tablename) || ' SELECT ($1).*' USING NEW;

RETURN NULL;
END;
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100;
ALTER FUNCTION trg_partition()
OWNER TO postgres;</pre>
<p><strong>Setting triggers for every table</strong></p>
<pre>CREATE TRIGGER partition_trg BEFORE INSERT ON history FOR EACH ROW EXECUTE PROCEDURE trg_partition('day');
CREATE TRIGGER partition_trg BEFORE INSERT ON history_uint FOR EACH ROW EXECUTE PROCEDURE trg_partition('day');
CREATE TRIGGER partition_trg BEFORE INSERT ON history_str  FOR EACH ROW EXECUTE PROCEDURE trg_partition('day');
CREATE TRIGGER partition_trg BEFORE INSERT ON history_text FOR EACH ROW EXECUTE PROCEDURE trg_partition('month');
CREATE TRIGGER partition_trg BEFORE INSERT ON events FOR EACH ROW EXECUTE PROCEDURE trg_partition('day');
CREATE TRIGGER partition_trg BEFORE INSERT ON trends FOR EACH ROW EXECUTE PROCEDURE trg_partition('day');
CREATE TRIGGER partition_trg BEFORE INSERT ON trends_uint FOR EACH ROW EXECUTE PROCEDURE trg_partition('day');</pre>
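<p>Once the triggers and both functions are in place (note that <code>trg_partition</code> calls <code>delete_partitions</code> from its exception handler, so install the function from the next step first), the setup can be smoke-tested. The column list below matches the standard Zabbix 2.x <code>history</code> schema (<code>itemid</code>, <code>clock</code>, <code>value</code>, <code>ns</code>); adjust it if your schema differs:</p>
<pre>-- Insert one test value; the BEFORE INSERT trigger redirects it
-- into the matching daily partition, creating it if necessary
INSERT INTO history (itemid, clock, value, ns)
VALUES (1, EXTRACT(epoch FROM now())::integer, 0.0, 0);

-- The new child table should now be listed under the partitions schema
SELECT tablename FROM pg_tables WHERE schemaname = 'partitions';</pre>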
<p><strong>Creation of the function which will be deleting partitions</strong><br />
While the existing solution uses a more complicated function relying on cron to delete old partitions, we propose a new function which uses a new table &#8220;partition_life&#8221;.<br />
In the new table (picture 4) we&#8217;ll store how long the partitions of each parent table should be kept. If you don&#8217;t want to delete the partitions of a particular table, set the days column to -1. This way you don&#8217;t need cron to delete partitions automatically; it&#8217;s pure PL/pgSQL.</p>
<p style="text-align: center;"><a href="http://6lab.cz/new/wp-content/uploads/2013/05/x3.png"><img class="aligncenter size-full wp-image-1895" src="http://6lab.cz/new/wp-content/uploads/2013/05/x3.png" alt="" width="175" height="236" /></a></p>
<p align="center"> Picture 4: New table partition_life</p>
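<p>The article shows the new table only as a picture; a minimal sketch of its DDL, inferred from the two columns the <code>delete_partitions</code> function reads (<code>name</code> and <code>days</code>), could look like this:</p>
<pre>CREATE TABLE partition_life (
    name text PRIMARY KEY,  -- parent table name, e.g. 'history'
    days integer NOT NULL   -- keep partitions this many days; -1 = keep forever
);

-- Illustrative retention values, not taken from the article
INSERT INTO partition_life (name, days) VALUES
    ('history', 365),
    ('history_uint', 365),
    ('events', -1);</pre>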
<pre>CREATE OR REPLACE FUNCTION delete_partitions(partitionname  text)
  RETURNS text AS
$BODY$
DECLARE
result RECORD;
prefix text := 'partitions.';
table_timestamp TIMESTAMP;
delete_before_timestamp TIMESTAMP;
v_tablename text;
v_max_partition_age integer := -1;

BEGIN
   -- (?=_p) Positive Lookahead
   v_tablename := substring(partitionname  FROM '^([a-z_]*)(?=_p)');
   -- Find the maximum partition age for current table
   SELECT days INTO v_max_partition_age FROM partition_life WHERE name = v_tablename;
   -- If selected table is set to have old partitions deleted
   IF (v_max_partition_age &lt;&gt; -1) THEN
      delete_before_timestamp := TO_TIMESTAMP(substring(partitionname FROM '[0-9_]*$'), 'YYYY_MM_DD') -
      (v_max_partition_age * INTERVAL '1 day');
      FOR result IN SELECT * FROM pg_tables WHERE schemaname = 'partitions' AND pg_tables.tablename LIKE
         (v_tablename || '_p%') LOOP
         table_timestamp := TO_TIMESTAMP(substring(result.tablename FROM '[0-9_]*$'), 'YYYY_MM_DD');
         IF table_timestamp &lt;= delete_before_timestamp THEN
            RAISE NOTICE 'Deleting table %', quote_ident(result.tablename);
	    EXECUTE 'DROP TABLE ' || prefix || quote_ident(result.tablename) || ';';
         END IF;
      END LOOP;
   END IF;
RETURN delete_before_timestamp;
END;
$BODY$
  LANGUAGE plpgsql VOLATILE
  COST 100;
ALTER FUNCTION delete_partitions(text)
  OWNER TO postgres;</pre>
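<p>The function is invoked automatically from <code>trg_partition</code> whenever a brand-new partition is about to be created, but it can also be called by hand with any partition name following the naming scheme above (the date here is only illustrative):</p>
<pre>SELECT delete_partitions('history_p2014_05_27');</pre>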
<p><strong>Migration of the old records to the new partitions</strong></p>
<pre>INSERT INTO history SELECT * FROM history;</pre>
<p>Once you have run these queries on your PostgreSQL server, the new partitions should start appearing under the partitions schema.</p>
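<p>Note that because the BEFORE INSERT trigger returns NULL, this query copies the existing rows into the new partitions but leaves the originals in the parent table. Assuming you have verified the copy succeeded, the duplicated rows can then be removed from the parent alone using PostgreSQL&#8217;s ONLY keyword, which excludes child tables:</p>
<pre>-- Remove the migrated rows from the parent table only;
-- the copies in the child partitions are untouched
DELETE FROM ONLY history;</pre>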
<h2>6 Conclusion</h2>
<p>After the optimization we reviewed the graphs once again to see if there were any improvements in the CPU utilization and disk size.</p>
<p style="text-align: center;"><a href="http://6lab.cz/new/wp-content/uploads/2013/05/Untitled4.png"><img class="alignnone size-full wp-image-1633" src="http://6lab.cz/new/wp-content/uploads/2013/05/Untitled4.png" alt="" width="608" height="341" /></a></p>
<p align="center">Picture 5: Database size after the optimization</p>
<p>In picture 5 we can see that the growth of the database size has been significantly reduced (1 GB vs 400 MB over 4 days), and picture 6 shows that the increased CPU utilization is no longer present, as the new approach deletes partitions with DROP TABLE instead of running DELETE on every row.</p>
<p align="center"><a href="http://6lab.cz/new/wp-content/uploads/2013/05/Untitled5.png"><img class="alignnone size-full wp-image-1634" src="http://6lab.cz/new/wp-content/uploads/2013/05/Untitled5.png" alt="" width="609" height="297" /></a></p>
<p align="center">Picture 6: CPU utilization after the optimization</p>
<p>We also reviewed how the PostgreSQL planner plans queries after the optimization. We ran the same query again:</p>
<pre>EXPLAIN ANALYZE SELECT *
FROM history
WHERE clock &gt; 1394841600 AND clock &lt; 1395014400</pre>
<p>The result was:</p>
<pre>QUERY PLAN
--------------------------------------------------------------
 -&gt;  Append  (cost=0.00..64049.25 rows=2858109 width=21) (actual time=0.101..766.664 rows=2858660 loops=1)
   	-&gt;  Seq Scan on history  (cost=0.00..10.15 rows=1 width=21) (actual time=0.089..0.089 rows=0 loops=1)
         	Filter: ((clock &gt; 1394841600) AND (clock &lt; 1395014400))
         	Rows Removed by Filter: 410
   	-&gt;  Seq Scan on history_p2014_03_15 history  (cost=0.00..31506.61 rows=1406096 width=21) (actual time=0.012..274.342
rows=1406361 loops=1)
         	Filter: ((clock &gt; 1394841600) AND (clock &lt; 1395014400))
         	Rows Removed by Filter: 13
   	-&gt;  Seq Scan on history_p2014_03_16 history  (cost=0.00..32532.49 rows=1452012 width=21) (actual time=0.019..271.521
rows=1452299 loops=1)
         	Filter: ((clock &gt; 1394841600) AND (clock &lt; 1395014400))

Total runtime: 1205.666 ms
</pre>
<p>This proves that the planner knows about the CHECK constraints on our partitions and only scans the partitions whose CHECK constraints overlap the interval specified in our query.<br />
The result of the query was therefore returned in 1.2 seconds, which is almost five times faster than before the optimization (5.9 seconds).</p>
<h2>7 Bibliography</h2>
<p align="left">[1] PostgreSQL Database Partitioning with Zabbix 2 [online]. [cit. 13. 05. 2014]. <a title="https://www.zabbix.org/wiki/Docs/howto/zabbix2_postgresql_partitioning" href="https://www.zabbix.org/wiki/Docs/howto/zabbix2_postgresql_partitioning">https://www.zabbix.org/wiki/Docs/howto/zabbix2_postgresql_partitioning</a>.<br />
[2] Auto Partitioning with Zabbix 2.0 and Postgresql [online]. [cit. 17. 05. 2014]. <a title="https://www.zabbix.org/wiki/Docs/howto/zabbix2_postgresql_autopartitioning" href="https://www.zabbix.org/wiki/Docs/howto/zabbix2_postgresql_autopartitioning">https://www.zabbix.org/wiki/Docs/howto/zabbix2_postgresql_autopartitioning</a>.</p>
<div  class="x-author-box cf" ><h6 class="h-about-the-author">About the main author</h6><div class="x-author-info"><h4 class="h-author mtn">Peter Drienko</h4><a href="http://www.fit.vutbr.cz" class="x-author-social" title="Visit the website for Peter Drienko" target="_blank"><i class="x-icon-globe"></i> http://www.fit.vutbr.cz</a><span class="x-author-social"><i class="x-icon-envelope"></i> drienkop@gmail.com</span><p class="p-author mbn">Brno University of Technology student.</p></div></div>
]]></content:encoded>
			<wfw:commentRss>http://6lab.cz/optimizing-network-tracking-tool-for-management-of-but-campus-network-3/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
