{"id":1775,"date":"2014-03-23T21:44:32","date_gmt":"2014-03-23T21:44:32","guid":{"rendered":"http:\/\/davstott.me.uk\/?p=1775"},"modified":"2014-03-23T21:44:32","modified_gmt":"2014-03-23T21:44:32","slug":"saving-the-world-one-cpu-cycle-at-a-time","status":"publish","type":"post","link":"https:\/\/davstott.me.uk\/index.php\/2014\/03\/23\/saving-the-world-one-cpu-cycle-at-a-time\/","title":{"rendered":"Saving the world, one cpu cycle at a time"},"content":{"rendered":"<p>Energy is one of the biggest problems facing the world today, with an increasing amount of it being converted into compute cycles and heat in Warehouse Scale Computers in data centres around the world. In recent years, the big tech companies have invested an awful lot of time from some very clever people into squeezing more performance out of the servers that drive the products that many of us now take for granted. Indeed, one paper I read recently said that the popular Gmail service wouldn&#8217;t have been close to affordable to run without Google&#8217;s power efficiency in software and data centre design.<\/p>\n<p>My favourite side-effect of this renewed drive for efficiency is that this also gives us the tools to do lots of fun things on relatively small computers, and make our servers use up less electricity to perform the same tasks. 
A bit like replacing the air filter on a car engine to let it breathe more easily, each individual tweak might only save a few microseconds here and there, but it can soon add up.<\/p>\n<p>We can&#8217;t all be <a href=\"https:\/\/plus.google.com\/+KentonVarda\/posts\/TSDhe5CvaFe\">Jeff Dean<\/a>, and most of us need more than <a href=\"http:\/\/tinyhack.com\/2014\/03\/12\/implementing-a-web-server-in-a-single-printf-call\/\">a printf() call<\/a> to implement an HTTP server, so I&#8217;m going to start with a simple technique I used to stop my little Bytemark VM having to do some tasks that just aren&#8217;t necessary at all.<\/p>\n<h2>Stopping dictionary attacks on wp-login.php with a couple of lines of nginx configuration<\/h2>\n<p>One downside of using a popular content management system like WordPress is that many script kiddies like to perform automated attacks on it, looking for people who&#8217;ve chosen poor usernames and passwords to edit their content. Unfortunately, that sort of attack causes web servers to get very busy, using up power and generally slowing things down for people trying to do real work. One such attack is illustrated in the chart below, showing my server&#8217;s CPU activity over 24 hours from a week or two ago. 
The data came from sar&#8217;s standard logs.<\/p>\n<div id=\"attachment_1774\" style=\"width: 635px\" class=\"wp-caption aligncenter\"><a href=\"http:\/\/davstott.me.uk\/wordpress\/wp-content\/uploads\/2014\/03\/cpuChart13.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-1774\" src=\"http:\/\/davstott.me.uk\/wordpress\/wp-content\/uploads\/2014\/03\/cpuChart13-1024x526.png\" alt=\"A chart of my server&#039;s CPU activity, the % in use\" width=\"625\" height=\"321\" class=\"size-large wp-image-1774\" srcset=\"https:\/\/davstott.me.uk\/wordpress\/wp-content\/uploads\/2014\/03\/cpuChart13-1024x526.png 1024w, https:\/\/davstott.me.uk\/wordpress\/wp-content\/uploads\/2014\/03\/cpuChart13-300x154.png 300w, https:\/\/davstott.me.uk\/wordpress\/wp-content\/uploads\/2014\/03\/cpuChart13-624x320.png 624w, https:\/\/davstott.me.uk\/wordpress\/wp-content\/uploads\/2014\/03\/cpuChart13.png 1191w\" sizes=\"auto, (max-width: 625px) 100vw, 625px\" \/><\/a><p id=\"caption-attachment-1774\" class=\"wp-caption-text\">A chart of my server&#8217;s CPU activity, the % in use<\/p><\/div>\n<p>What could cause such a big plateau of busywork, I hear you ask? It turned out to be a couple of IP addresses with very dodgy DNS names blatting away at my wp-login.php several times a second for hours on end.<\/p>\n<p>I made this traffic go away with a couple of simple <a href=\"http:\/\/nginx.org\/en\/docs\/http\/ngx_http_limit_req_module.html\">nginx configuration<\/a> directives that reject excess requests with an error (HTTP 503 by default), which made the attacking script believe that my website had gone away after a handful of failed login attempts.<\/p>\n<p>The configuration comes in two parts. The first goes in nginx.conf&#8217;s http{} section. 
This creates a bucket named &#8216;myZone&#8217;, with a rule that says any content locations mapped to that bucket should not average more than one request per second from a given remote IP address:<\/p>\n<p><code> limit_req_zone $binary_remote_addr zone=myZone:2m rate=1r\/s;<\/code><\/p>\n<p>Adding this next stanza to your virtual host&#8217;s server{} section (sorry for using the ~ operator, lazy Dav was lazy) maps any URLs containing the string &#8216;wp-login.php&#8217; into that bucket. The &#8220;burst=3&#8221; parameter allows me to retry a failed login a couple of times without locking me out of my own website, and consigning error logging to \/dev\/null means that I don&#8217;t waste disk space or webstats on the banned login requests, although it also means that nothing about those rejected requests ever reaches the error log:<\/p>\n<p><code> location ~ wp-login.php {<br \/>\n      \tlimit_req zone=myZone burst=3;<br \/>\n      \tfastcgi_pass php;<br \/>\n      \tinclude fastcgi_params;<br \/>\n      \terror_log \/dev\/null;<br \/>\n  }<\/code><\/p>\n<p>Whilst this deflects a simple attack from a small number of IP addresses, it wouldn&#8217;t help so much if my server were the target of a distributed attack from a large number of IP addresses. This is where cloud-based services such as <a href=\"http:\/\/wordpress.org\/plugins\/bruteprotect\/\">Brute Protect<\/a> come into play, letting server administrators share data on failed login attempts into a coordinated pool to try to protect everybody. 
This plugin runs in PHP, so isn&#8217;t as efficient as blocking the unwanted traffic in the web server or firewall, but it&#8217;s still useful and doesn&#8217;t take much time at all to set up.<\/p>\n<h2>Upgrade your PHP and use Zend&#8217;s opcache<\/h2>\n<p>If you haven&#8217;t heard about op-code caches before, I recommend that you read <a href=\"https:\/\/support.cloud.engineyard.com\/entries\/26902267-PHP-Performance-I-Everything-You-Need-to-Know-About-OpCode-Caches\">EngineYard&#8217;s excellent explanation of PHP Opcode Caches<\/a>, which goes into plenty of detail.<\/p>\n<p>From what I remember of <a href=\"https:\/\/twitter.com\/julienPauli\">@julienPauli<\/a>&#8217;s excellent talk on <a href=\"http:\/\/www.slideshare.net\/jpauli\/yoopee-cache-op-cache-internals\">OpCache internals<\/a> at <a href=\"https:\/\/joind.in\/10708\">this year<\/a>&#8217;s <a href=\"http:\/\/phpconference.co.uk\/\">PHP London conference<\/a>, the recently open-sourced opcache works wonders when using frameworks that compile a large number of classes that aren&#8217;t used all that often. Rather like my WordPress site.<\/p>\n<p>Unlike APC, Opcache doesn&#8217;t just cache the results of compiling your PHP code; it also includes some optimisations, such as replacing i++ with ++i to avoid having to allocate a temporary variable that is never referenced to store the results of the increment. 
Taken together, these optimisations become significant.<\/p>\n<p>To look at the impact of upgrading to php5.5 and switching on opcache, I&#8217;ve charted the difference in the time taken to draw the front page of my blog:<\/p>\n<p><script type=\"text\/javascript\" src=\"https:\/\/www.google.com\/jsapi\"><\/script><br \/>\n\t<script type=\"text\/javascript\">\n  \tgoogle.load(\"visualization\", \"1\", {packages:[\"corechart\"]});\n  \tgoogle.setOnLoadCallback(drawVisualization);\nfunction drawVisualization() {\n  var data = google.visualization.arrayToDataTable([\n\t['server', 'PHP 5.5', 'PHP 5.5 with Opcache'],\n\t['server', 10.64,  3.68],\n  ]);\n  new google.visualization.BarChart(document.getElementById('chart_div')).\n  \tdraw(data,\n       \t{title:\"Time taken to complete http request (shorter is better)\",\n        \twidth:500, height:250,\n        \thAxis: {title: \"Milliseconds\"}}\n  \t);\n}\n<\/script><\/p>\n<div id=\"chart_div\" style=\"width: 500px; height: 250px;\">&nbsp;<\/div>\n<p>I generated my numbers completely unscientifically by blatting away for a bit at an offline copy of my website using the excellent <a href=\"http:\/\/www.joedog.org\/siege-home\/\">Siege load testing tool<\/a>. Almost all of my home computers (maybe not my Raspberry Pi) are way more powerful than my Bytemark VM, and anyway, it&#8217;s only the relative difference between them that&#8217;s interesting. Enabling opcache cut the request time from 10.64 ms to 3.68 ms, so pages came back nearly three times as fast for no extra cost. Nice. 
<\/p>\n<p>If you&#8217;d like a more sensible comparison of PHP versions and op-cache options, this is a good place to start: <a href=\"http:\/\/www.ricardclau.com\/2013\/03\/apc-vs-zend-optimizer-benchmarks-with-symfony2\/\">http:\/\/www.ricardclau.com\/2013\/03\/apc-vs-zend-optimizer-benchmarks-with-symfony2\/<\/a><\/p>\n<h2>Use a better malloc(3)<\/h2>\n<p>By and large, you get the biggest speed and efficiency increases by improving the code you&#8217;ve written for your application and looking at database indexes and structures, but you can get some interesting results by making a very small improvement to code that is called a great many times over a period of months or even years. One such piece of code is the standard C library function that allocates a section of memory for your application, <a href=\"http:\/\/msdn.microsoft.com\/en-us\/library\/6ewkz86d.aspx\">malloc()<\/a>.<\/p>\n<p>I&#8217;m not going to dwell on this in too much detail, largely because many other people have already done so, but there are two popular alternatives to glibc&#8217;s malloc(): FreeBSD&#8217;s <a href=\"http:\/\/www.canonware.com\/jemalloc\/\">jemalloc<\/a> and Google&#8217;s <a href=\"http:\/\/goog-perftools.sourceforge.net\/doc\/tcmalloc.html\">tcmalloc<\/a>.<\/p>\n<p><a href=\"https:\/\/www.facebook.com\/notes\/facebook-engineering\/scalable-memory-allocation-using-jemalloc\/480222803919\">Facebook has reported<\/a> some very good results from using jemalloc for general use as well as <a href=\"https:\/\/www.facebook.com\/notes\/mysql-at-facebook\/using-jemalloc-to-fix-a-performance-problem\/10150494400690933\">speeding up MySQL with it<\/a>; my favourite key-value store Redis <a href=\"http:\/\/oldblog.antirez.com\/post\/everything-about-redis-24\">uses jemalloc<\/a>, and it became the <a href=\"http:\/\/blog.pavlov.net\/2008\/03\/11\/firefox-3-memory-usage\/\">allocator of choice for Mozilla<\/a> starting with 
Firefox version 3.<\/p>\n<p>That said, apart from Google&#8217;s own use of it, <a href=\"https:\/\/github.com\/blog\/1422-tcmalloc-and-mysql\">GitHub used tcmalloc<\/a> on their MySQL servers to significantly reduce the time <a href=\"http:\/\/jamesgolick.com\/2012\/7\/18\/innodb-kernel-mutex-contention-and-memory-allocators.html\">InnoDB spent waiting for kernel_mutex<\/a>.<\/p>\n<p>Whilst I was reading about this, I tripped over the excellent <a href=\"http:\/\/poormansprofiler.org\/\">poor man&#8217;s profiler<\/a> approach to performance profiling, using the standard Linux tools to aggregate a collection of stack traces.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Energy is one of the biggest problems facing the world today, with an increasing amount of it being converted into compute cycles and heat in Warehouse Scale Computers in data centres around the world. In recent years, the big tech companies have invested an awful lot of time from some very clever people into squeezing 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[9],"tags":[],"class_list":["post-1775","post","type-post","status-publish","format-standard","hentry","category-tech"],"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/davstott.me.uk\/index.php\/wp-json\/wp\/v2\/posts\/1775","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/davstott.me.uk\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/davstott.me.uk\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/davstott.me.uk\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/davstott.me.uk\/index.php\/wp-json\/wp\/v2\/comments?post=1775"}],"version-history":[{"count":15,"href":"https:\/\/davstott.me.uk\/index.php\/wp-json\/wp\/v2\/posts\/1775\/revisions"}],"predecessor-version":[{"id":1790,"href":"https:\/\/davstott.me.uk\/index.php\/wp-json\/wp\/v2\/posts\/1775\/revisions\/1790"}],"wp:attachment":[{"href":"https:\/\/davstott.me.uk\/index.php\/wp-json\/wp\/v2\/media?parent=1775"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/davstott.me.uk\/index.php\/wp-json\/wp\/v2\/categories?post=1775"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/davstott.me.uk\/index.php\/wp-json\/wp\/v2\/tags?post=1775"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}