<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/"><channel><title><![CDATA[a star shines upon the hour of our meeting]]></title><description><![CDATA[No Description]]></description><link>https://justin.palpant.us/</link><image><url>https://justin.palpant.us/favicon_128.png</url><title>a star shines upon the hour of our meeting</title><link>https://justin.palpant.us/</link></image><generator>Jamify 1.0</generator><lastBuildDate>Wed, 08 Apr 2026 00:02:58 GMT</lastBuildDate><atom:link href="https://justin.palpant.us/rss/" rel="self" type="application/rss+xml"/><ttl>60</ttl><item><title><![CDATA[Tuning cluster-autoscaler on GKE]]></title><description><![CDATA[GKE makes Kubernetes easy to manage on GCP, but its autoscaling node pools can be slow to scale down, leading to increased cluster costs. Learn a few tips to cut down on scale-down times and get the most for your money.]]></description><link>https://justin.palpant.us/tuning-cluster-autoscaler-on-gke/</link><guid isPermaLink="false">Ghost__Post__5eea4fbaf0526d0006a08374</guid><category><![CDATA[kubernetes]]></category><category><![CDATA[gke]]></category><category><![CDATA[cluster-autoscaler]]></category><category><![CDATA[infrastructure]]></category><dc:creator><![CDATA[Justin Palpant]]></dc:creator><pubDate>Wed, 17 Jun 2020 22:21:58 GMT</pubDate><media:content url="https://justin.palpant.us/static/204935d7c297b4165f302ec588d15af5/1000px-Kubernetes-Engine-Logo.png" medium="image"/><content:encoded><![CDATA[<img src="https://justin.palpant.us/static/204935d7c297b4165f302ec588d15af5/1000px-Kubernetes-Engine-Logo.png" alt="Tuning cluster-autoscaler on GKE"/><p>I run a small GKE cluster to host <a href="https://gitlab.palpant.us/justin/palpantlab-infra?ref=ghost.justin.palpant.us#what-is-it">a number of personal projects</a>, 
including this blog. Since this cluster only runs personal projects, I want to keep it as small as possible to keep costs down, but still make sure everything I run has the resources it needs to perform well.</p><p>In Kubernetes, <a href="https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/?ref=ghost.justin.palpant.us">resource limits and requests</a> give the cluster the information it needs about each pod to assign pods to nodes while making sure those nodes don't get too crowded or contend for resources. Nodes automatically register the amount of CPU and memory they have available for pods, and Kubernetes allocates pods to nodes until these resources are consumed. When all nodes are out of resources, no additional pods can be scheduled, and pods <a href="https://kubernetes.io/docs/tasks/administer-cluster/out-of-resource/?ref=ghost.justin.palpant.us">may be evicted</a> to allow higher-priority pods to be scheduled.</p><p>Many cloud platforms, however, support deploying the <a href="https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler?ref=ghost.justin.palpant.us#cluster-autoscaler">cluster autoscaler</a> to automatically create new Kubernetes nodes when more resources are requested. Google Cloud supports this <a href="https://cloud.google.com/kubernetes-engine/docs/how-to/cluster-autoscaler?ref=ghost.justin.palpant.us">as an add-on for GKE clusters</a>, and individual node pools can then be configured with autoscaling enabled or disabled, with autoscaling limits for each node pool.</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler?ref=ghost.justin.palpant.us#cluster-autoscaler"><div class="kg-bookmark-content"><div class="kg-bookmark-title">kubernetes/autoscaler</div><div class="kg-bookmark-description">Autoscaling components for Kubernetes. 
Contribute to kubernetes/autoscaler development by creating an account on GitHub.</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://github.githubassets.com/favicons/favicon.svg" alt="Tuning cluster-autoscaler on GKE"><span class="kg-bookmark-author">GitHub</span><span class="kg-bookmark-publisher">kubernetes</span></img></div></div><div class="kg-bookmark-thumbnail"><img src="https://avatars1.githubusercontent.com/u/13629408?s=400&#x26;v=4" alt="Tuning cluster-autoscaler on GKE"/></div></a></figure><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://ghost.justin.palpant.us/content/images/2020/06/image.png" class="kg-image" alt="Tuning cluster-autoscaler on GKE" loading="lazy"><figcaption>A Kubernetes node pool with autoscaling enabled, allowing 0-8 nodes to be created by the cluster-autoscaler</figcaption></img></figure><p>But while configuring autoscaling on GKE is simple and clean, the autoscaler isn't particularly eager to scale down. This is a common problem with cloud providers, not just the GKE cluster-autoscaler. For large companies, it sometimes makes sense to write your own code to scale down, as Sony Imageworks did when launching their hybrid-cloud renderfarm in GCP:</p><figure class="kg-card kg-embed-card kg-card-hascaption"><iframe width="480" height="270" src="https://www.youtube.com/embed/ODOJ3UbnV6Y?start=1842&#x26;feature=oembed" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""/><figcaption>Sony Imageworks learned that automatic autoscaling was too slow on GCP, and chose to manage it manually</figcaption></figure><p>For the rest of us, it may be too costly or complicated to take over autoscaling from Kubernetes. 
Fortunately, there are a few ways to help the autoscaler along and cut costs.</p><h2 id="ask-nicely">Ask nicely</h2><p>The cluster autoscaler on any GKE cluster has a number of configuration options - some available only via the API or <code class="language-text">gcloud</code> CLI, or on <code class="language-text">alpha</code> or <code class="language-text">beta</code> API endpoints. The full set of options available on the CLI can be found in the gcloud SDK documentation for <code><a href="https://cloud.google.com/sdk/gcloud/reference/container/clusters?ref=ghost.justin.palpant.us">gcloud container clusters</a></code>.</p><p>One relevant option is found in the beta command: <a href="https://cloud.google.com/kubernetes-engine/docs/concepts/cluster-autoscaler?ref=ghost.justin.palpant.us#autoscaling_profiles">autoscaling profiles</a>.</p><div class="kg-card kg-code-card gatsby-highlight" data-language="bash"><pre class="language-bash"><code class="language-bash">gcloud beta container clusters update example-cluster <span class="token punctuation">\</span>
--autoscaling-profile optimize-utilization</code></pre></div><p>Setting the cluster's autoscaling profile to <code class="language-text">optimize-utilization</code> instead of the default value <code class="language-text">balanced</code> will cause the autoscaler to prefer to scale down nodes quickly whenever possible. Google doesn't go into the specifics of how this profile is implemented, but does leave this note of advice:</p><blockquote>This profile has been optimized for use with batch workloads that are not sensitive to start-up latency. We do not currently recommend using this profile with serving workloads.</blockquote><h2 id="pod-tuning-with-autoscaler-events">Pod tuning with autoscaler events</h2><p>cluster-autoscaler is a process like any other, and on many Kubernetes variants, it runs on the cluster, possibly on the master node, as a Pod. This is not the case for GKE, which hides the implementation details of the cluster master: the autoscaler's logs are inaccessible, and many of the <a href="https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/config/autoscaling_options.go?ref=ghost.justin.palpant.us">options available upstream</a> cannot be configured.</p><p>Fortunately, GKE does expose a custom interface for understanding why the autoscaler is making decisions to scale nodes up or down: <a href="https://cloud.google.com/kubernetes-engine/docs/how-to/cluster-autoscaler-visibility?ref=ghost.justin.palpant.us#top_of_page">autoscaler events in Cloud Logging</a>. These events are available starting with GKE 1.15.4-gke.7; the noScaleDown event type, the most recent at the time of writing, was added in 1.16.8-gke.2. For cost savings, this last event is the most relevant, since it tells you why a node <em>wasn't</em> removed, so the rest of this post will assume you are using GKE 1.16.8-gke.2 or later.</p><p>The linked page gives a good guide on how to view these events via Cloud Logging. 
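</p><p>For example, a filter along the following lines surfaces only the autoscaler's noScaleDown decisions. This is a sketch, with <code class="language-text">PROJECT_ID</code> and <code class="language-text">example-cluster</code> standing in for your own project and cluster name:</p><div class="kg-card kg-code-card gatsby-highlight" data-language="text"><pre class="language-text"><code class="language-text">resource.type="k8s_cluster"
resource.labels.cluster_name="example-cluster"
logName="projects/PROJECT_ID/logs/container.googleapis.com%2Fcluster-autoscaler-visibility"
jsonPayload.noDecisionStatus.noScaleDown:*</code></pre></div><p>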
For each log item, the most relevant information is found in the log field <code class="language-text">jsonPayload.noDecisionStatus.noScaleDown.nodes[].reason.messageId</code>, which will be one of <a href="https://cloud.google.com/kubernetes-engine/docs/how-to/cluster-autoscaler-visibility?ref=ghost.justin.palpant.us#noscaledown-reasons">these enumerated items</a>. Here are some common issues that exploring the autoscaler's log events revealed on my cluster.</p><h3 id="kube-system">kube-system</h3><p>Though the docs are pretty clear, I was unaware that cluster-autoscaler will by default <a href="https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md?ref=ghost.justin.palpant.us#how-to-set-pdbs-to-enable-ca-to-move-kube-system-pods">refuse to evict non-DaemonSet Pods in the <code class="language-text">kube-system</code> namespace</a>. When most of these pods are system daemons and they are few compared to your service pods, this is likely a small concern.</p><p>But on a default GKE cluster, especially a small one, there are a surprising number of non-evictable pods in the <code class="language-text">kube-system</code> namespace, and each one can prevent scale-down for a node to which it is assigned, no matter how low the utilization. 
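</p><p>To see which pods these are on your own cluster (assuming you have <code class="language-text">kubectl</code> access), you can list the <code class="language-text">kube-system</code> pods together with the kind of controller that owns them - anything whose owner is not a DaemonSet is a candidate to block scale-down:</p><div class="kg-card kg-code-card gatsby-highlight" data-language="bash"><pre class="language-bash"><code class="language-bash">kubectl get pods --namespace kube-system \
  --output custom-columns='NAME:.metadata.name,OWNER:.metadata.ownerReferences[0].kind'</code></pre></div><p>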
A few such pods that are likely present on your clusters:</p><ul><li>kube-dns autoscaler</li><li>calico-node-vertical-autoscaler</li><li>calico-typha</li><li>calico-typha-horizontal-autoscaler</li><li>calico-typha-vertical-autoscaler</li><li>metrics-server</li></ul><p>For my cluster, on top of these, I had unwisely deployed even more pods to this system-critical namespace:</p><ul><li>Helm v2's Tiller pod</li><li>NGINX ingress controller pods</li><li>NGINX ingress default backend pod</li><li><a href="https://github.com/estafette/estafette-gke-preemptible-killer?ref=ghost.justin.palpant.us">estafette's GKE preemptible node killer daemon</a></li></ul><p>Altogether, these pods blocked downscaling a significant proportion of the time.</p><p>These events can be identified by the <code class="language-text">messageId</code> <code class="language-text">"no.scale.down.node.pod.kube.system.unmovable"</code> - to query only events like this, you could add the following line to your log filter:</p><div class="kg-card kg-code-card gatsby-highlight" data-language="text"><pre class="language-text"><code class="language-text">jsonPayload.noDecisionStatus.noScaleDown.nodes.reason.messageId = "no.scale.down.node.pod.kube.system.unmovable"</code></pre></div><p>The majority of these pods can be considered low-priority control plane pods, which are safe to evict. 
And the <a href="https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md?ref=ghost.justin.palpant.us#how-to-set-pdbs-to-enable-ca-to-move-kube-system-pods">Cluster Autoscaler FAQ</a> provides this advice on how to denote the pods as safe to evict:</p><div class="kg-card kg-code-card gatsby-highlight" data-language="bash"><pre class="language-bash"><code class="language-bash">kubectl create poddisruptionbudget <span class="token operator">&#x3C;</span>pdb name<span class="token operator">></span> --namespace<span class="token operator">=</span>kube-system --selector <span class="token assign-left variable">app</span><span class="token operator">=</span><span class="token operator">&#x3C;</span>app name<span class="token operator">></span> --max-unavailable <span class="token number">1</span></code></pre></div><h3 id="knapsack-packing">Knapsack packing</h3><p>One last way to optimize autoscaling is to carefully set the resource requests on your pods to avoid individual pods with large requests.</p><p>It's important to remember that for the autoscaler to scale down a node, it must be able to schedule all pods on that node onto other nodes, without scaling any node groups up. A node with only one pod cannot be scaled down if that pod doesn't fit on any other node. This is true even if the cluster as a whole has enough resources to accommodate that one pod - Kubernetes does not try to plan a series of evictions that will more densely pack the nodes, and will simply skip scaling down.</p><p>The easiest way to avoid this is to run pods with requests small enough that they can be easily packed onto nodes. The relevant value is the ratio of a pod's request to the allocatable amount of that resource on a given node. 
If a single pod requires a large proportion of a node's resources, it will be harder to evict that pod and harder to scale down any node running that pod.</p><p>It's impossible to give general guidelines on how to size pod requests, but I <em>can</em> tell you how to detect if this is causing the autoscaler to avoid scaling down - the relevant <code class="language-text">messageId</code> is <code class="language-text">"no.scale.down.node.no.place.to.move.pods"</code>:</p><div class="kg-card kg-code-card gatsby-highlight" data-language="text"><pre class="language-text"><code class="language-text">jsonPayload.noDecisionStatus.noScaleDown.nodes.reason.messageId = "no.scale.down.node.no.place.to.move.pods"</code></pre></div><h2 id="lastly-cheat">Lastly: cheat</h2><p>This one is easy: just don't request resources!</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://images.unsplash.com/photo-1542185400-f1c993ecbea2?ixlib=rb-1.2.1&#x26;q=80&#x26;fm=jpg&#x26;crop=entropy&#x26;cs=tinysrgb&#x26;w=2000&#x26;fit=max&#x26;ixid=eyJhcHBfaWQiOjExNzczfQ" class="kg-image" alt="Tuning cluster-autoscaler on GKE" loading="lazy" width="5283" height="2972" srcset="https://images.unsplash.com/photo-1542185400-f1c993ecbea2?ixlib=rb-1.2.1&#x26;q=80&#x26;fm=jpg&#x26;crop=entropy&#x26;cs=tinysrgb&#x26;w=600&#x26;fit=max&#x26;ixid=eyJhcHBfaWQiOjExNzczfQ 600w, https://images.unsplash.com/photo-1542185400-f1c993ecbea2?ixlib=rb-1.2.1&#x26;q=80&#x26;fm=jpg&#x26;crop=entropy&#x26;cs=tinysrgb&#x26;w=1000&#x26;fit=max&#x26;ixid=eyJhcHBfaWQiOjExNzczfQ 1000w, 
https://images.unsplash.com/photo-1542185400-f1c993ecbea2?ixlib=rb-1.2.1&#x26;q=80&#x26;fm=jpg&#x26;crop=entropy&#x26;cs=tinysrgb&#x26;w=1600&#x26;fit=max&#x26;ixid=eyJhcHBfaWQiOjExNzczfQ 1600w, https://images.unsplash.com/photo-1542185400-f1c993ecbea2?ixlib=rb-1.2.1&#x26;q=80&#x26;fm=jpg&#x26;crop=entropy&#x26;cs=tinysrgb&#x26;w=2400&#x26;fit=max&#x26;ixid=eyJhcHBfaWQiOjExNzczfQ 2400w" sizes="(min-width: 720px) 720px"><figcaption>Photo by <a href="https://unsplash.com/@chernus_tr?utm_source=ghost&#x26;utm_medium=referral&#x26;utm_campaign=api-credit">Taras Chernus</a> / <a href="https://unsplash.com/?utm_source=ghost&#x26;utm_medium=referral&#x26;utm_campaign=api-credit">Unsplash</a></figcaption></img></figure><p>Though not usually the right answer, sometimes Kubernetes is not cut out to manage resources for your pods. Maybe you're okay with heavily loading a node or manually assigning some Pods to it (making it more <a href="http://cloudscaling.com/blog/cloud-computing/the-history-of-pets-vs-cattle/?ref=ghost.justin.palpant.us">like a pet, less like cattle</a>). Maybe you have database workloads and care more about managing IOPS than CPU and memory, or ML workloads where the only relevant resource is access to physical GPUs, and you want to share GPUs between pods (not yet supported on Kubernetes).</p><p>In this situation, you can do one of two things: drop the <code class="language-text">resources:</code> block of the PodSpec altogether, or continue to take advantage of Pod resource <em>limits</em>, but without requests. 
Resource limits do not drive or affect autoscaling or pod assignment - they control cgroup CPU slice allocation, process memory limits for the OOM killer, and pod eviction priority when a node is out of memory.</p><p>To do the latter, you have to be careful: a <code class="language-text">resources</code> spec without any <code class="language-text">requests:</code> <a href="https://kubernetes.io/docs/tasks/administer-cluster/manage-resources/memory-default-namespace/?ref=ghost.justin.palpant.us#what-if-you-specify-a-container-s-limit-but-not-its-request">will default to matching requests to limits</a> according to the Kubernetes specification.</p><figure class="kg-card kg-code-card"><div class="kg-card kg-code-card gatsby-highlight" data-language="yaml"><pre class="language-yaml"><code class="language-yaml"><span class="token key atrule">resources</span><span class="token punctuation">:</span>
  <span class="token key atrule">limits</span><span class="token punctuation">:</span>
    <span class="token key atrule">cpu</span><span class="token punctuation">:</span> <span class="token number">2</span>
    <span class="token key atrule">memory</span><span class="token punctuation">:</span> 2Gi</code></pre></div><figcaption>A Pod with this resource spec will also request 2 CPUs and 2Gi of memory, and will not be scheduled to nodes with less available resources than that. It will also trigger the autoscaler to create new nodes if resources are unavailable.</figcaption></figure><p> You must manually set <code class="language-text">requests:</code> to a small value:</p><figure class="kg-card kg-code-card"><div class="kg-card kg-code-card gatsby-highlight" data-language="yaml"><pre class="language-yaml"><code class="language-yaml"><span class="token key atrule">resources</span><span class="token punctuation">:</span>
  <span class="token key atrule">limits</span><span class="token punctuation">:</span>
    <span class="token key atrule">cpu</span><span class="token punctuation">:</span> <span class="token number">2</span>
    <span class="token key atrule">memory</span><span class="token punctuation">:</span> 2Gi
  <span class="token key atrule">requests</span><span class="token punctuation">:</span>
    <span class="token key atrule">cpu</span><span class="token punctuation">:</span> 10m
    <span class="token key atrule">memory</span><span class="token punctuation">:</span> 16Mi
</code></pre></div><figcaption>A Pod with this resource spec will be prevented from consuming more than 2 CPUs or 2Gi of memory, but can be scheduled even on a very busy node because of the low resource requests</figcaption></figure><hr/><p>With a bit of investigation and some small tweaks, the cluster autoscaler on GKE can be made to behave quite well. If you found this quick intro to some of the ways to control it useful, please get in touch and let me know at <a href="mailto:justin@palpant.us">justin@palpant.us</a> or via <a href="https://www.linkedin.com/in/jpalpant?ref=ghost.justin.palpant.us">LinkedIn</a> or <a href="https://twitter.com/justin_palpant?ref=ghost.justin.palpant.us">Twitter</a>.</p>]]></content:encoded></item><item><title><![CDATA[Simulating user traffic with Chrome and Golang]]></title><description><![CDATA[Load testing is a critical part of making sure your website or web application is robust to heavy traffic. I tried a few load testing tools and built one of my own with Go and Chrome, and learned some of the ins-and-outs of load testing a user-facing website and what you can learn from it.]]></description><link>https://justin.palpant.us/simulating-user-traffic-with-chrome-and-golang/</link><guid isPermaLink="false">Ghost__Post__5ebc69ab3c740f00065e6f9e</guid><category><![CDATA[infrastructure]]></category><category><![CDATA[testing]]></category><category><![CDATA[grafana]]></category><category><![CDATA[golang]]></category><category><![CDATA[chrome]]></category><dc:creator><![CDATA[Justin Palpant]]></dc:creator><pubDate>Thu, 04 Jun 2020 02:05:07 GMT</pubDate><media:content url="https://justin.palpant.us/static/66525d24a607c10dc25529a0038734ca/chrome-load-agent-blog-img.jpg" medium="image"/><content:encoded><![CDATA[<img src="https://justin.palpant.us/static/66525d24a607c10dc25529a0038734ca/chrome-load-agent-blog-img.jpg" alt="Simulating user traffic with Chrome and Golang"/><p>For any website, app, or product you support, there are a few 
dimensions to providing a good experience to your users, like availability, because no one likes error messages, or latency, because interactions should be smooth and quick.</p><p>Load testing is a great way to expose bottlenecks, fragility, and performance issues in your application. By adding a large amount of traffic in a controlled manner, you can often spot these issues before your users do. And it never hurts to be prepared for what might happen if your blog goes viral!</p><p>My interest in automated load testing came about because I noticed that when running simple tests, the performance characteristics of my sites (like <a href="https://gitlab.palpant.us/justin/palpantlab-infra?ref=ghost.justin.palpant.us">my Gitlab instance</a>, or this blog) became totally unpredictable - different than what they were in the "steady-state" (with little to no traffic).</p><p>There are many ways to load test a website - so let's start with the most basic.</p><h2 id="making-some-http-requests">Making some HTTP requests</h2><p>If you want to make sure that the HTTP-serving components of your system are performant, making simple HTTP requests to a public website is easy, and can be done at high scale with minimal resources.</p><p>A consumer-grade laptop running a cURL script can easily make hundreds of requests per second. 
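</p><p>Such a script can be as simple as a shell one-liner - for example (against a placeholder URL), fanning 800 requests out across eight parallel <code class="language-text">curl</code> processes and printing the status code and total time of each:</p><div class="kg-card kg-code-card gatsby-highlight" data-language="bash"><pre class="language-bash"><code class="language-bash">seq 1 800 | xargs -P 8 -n 1 -I{} \
  curl -s -o /dev/null -w '%{http_code} %{time_total}\n' https://example.com/</code></pre></div><p>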
<a href="https://httpd.apache.org/docs/2.4/programs/ab.html?ref=ghost.justin.palpant.us">ab</a>, a classic tool from Apache, and the more modern <a href="https://github.com/wg/wrk?ref=ghost.justin.palpant.us">wrk</a>, take the basic principle of cURL and provide configurable parallelism and (in the basic cases) high QPS, as well as statistic reporting, which can give you an idea of the range of latency and throughput characteristics of a server.</p><p>All of these tools can be run with almost no overhead, maxing out a basic webserver while consuming little to no CPU or memory on the load test machine.</p><figure class="kg-card kg-code-card"><div class="kg-card kg-code-card gatsby-highlight" data-language="bash"><pre class="language-bash"><code class="language-bash">$ wrk -t <span class="token number">8</span> -c <span class="token number">100</span> -d 120s https://justin.palpant.us/folding-home-on-kubernetes/
Running 2m <span class="token builtin class-name">test</span> @ https://justin.palpant.us/folding-home-on-kubernetes/
  <span class="token number">8</span> threads and <span class="token number">100</span> connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   <span class="token number">257</span>.73ms  <span class="token number">312</span>.05ms   <span class="token number">1</span>.97s    <span class="token number">81.28</span>%
    Req/Sec    <span class="token number">50.72</span>     <span class="token number">25.21</span>   <span class="token number">151.00</span>     <span class="token number">66.31</span>%
  <span class="token number">46582</span> requests <span class="token keyword">in</span> <span class="token number">2</span>.00m, <span class="token number">4</span>.50GB <span class="token builtin class-name">read</span>
  Socket errors: connect <span class="token number">0</span>, <span class="token builtin class-name">read</span> <span class="token number">0</span>, <span class="token function">write</span> <span class="token number">0</span>, <span class="token function">timeout</span> <span class="token number">756</span>
Requests/sec:    <span class="token number">387.87</span>
Transfer/sec:     <span class="token number">38</span>.34MB</code></pre></div><figcaption><p><a href="https://github.com/wg/wrk?ref=ghost.justin.palpant.us"><span style="white-space: pre-wrap;">wrk</span></a><span style="white-space: pre-wrap;"> can easily make hundreds or thousands of HTTP requests per second from a laptop</span></p></figcaption></figure><p>However, these tools and tools like them have a weakness - for a webserver that serves a web application that users interact with via a browser, the load induced by a single HTTP request doesn't really mimic what would happen if a large number of users began to visit.</p><p>A user visiting a web page often has more side effects that cause load on your servers than one or even several HTTP requests.</p><ul><li>Additional dynamic HTTP calls based on executed Javascript</li><li>Server-side rendering of new components</li><li>Server-side caching based on request headers</li></ul><figure class="kg-card kg-gallery-card kg-width-wide kg-card-hascaption"><div class="kg-gallery-container"><div class="kg-gallery-row"><div class="kg-gallery-image"><img src="https://ghost.justin.palpant.us/content/images/2020/06/Screenshot-2020-06-03-16.26.53.png" width="659" height="661" loading="lazy" alt="Simulating user traffic with Chrome and Golang" srcset="https://ghost.justin.palpant.us/content/images/size/w600/2020/06/Screenshot-2020-06-03-16.26.53.png 600w, https://ghost.justin.palpant.us/content/images/2020/06/Screenshot-2020-06-03-16.26.53.png 659w"/></div><div class="kg-gallery-image"><img src="https://ghost.justin.palpant.us/content/images/2020/06/Screenshot-2020-06-03-16.27.04.png" width="1862" height="833" loading="lazy" alt="Simulating user traffic with Chrome and Golang" srcset="https://ghost.justin.palpant.us/content/images/size/w600/2020/06/Screenshot-2020-06-03-16.27.04.png 600w, https://ghost.justin.palpant.us/content/images/size/w1000/2020/06/Screenshot-2020-06-03-16.27.04.png 1000w, 
https://ghost.justin.palpant.us/content/images/size/w1600/2020/06/Screenshot-2020-06-03-16.27.04.png 1600w, https://ghost.justin.palpant.us/content/images/2020/06/Screenshot-2020-06-03-16.27.04.png 1862w" sizes="(min-width: 720px) 720px"/></div></div></div><figcaption><p><span style="white-space: pre-wrap;">One HTTP request with cURL yields some HTML and no requests to metric-serving backends; but the browser executes JS and makes additional requests, placing load on the entire system</span></p></figcaption></figure><h2 id="simulate-the-user">Simulate the user</h2><p>To overcome these limitations, you can try to expose your servers to load that, from the server's perspective, appears to be regular user traffic. Can't find a few hundred or thousand people to reload your blog all day? <em>No problem!</em> Fortunately, there are a few ways to automate this.</p><h3 id="3rd-party-tools">3rd-party tools</h3><p>Services such as <a href="https://flood.io/?ref=ghost.justin.palpant.us">flood.io</a> can help with this - they provide a tool to simulate a huge number of users, geographically distributed, interacting with your website in simple ways. For more advanced cases, you can even provide Selenium scripts to execute complex sequences of interactions with a website. 
I use and will continue to use Flood for occasional <em>high-stress </em>testing.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://ghost.justin.palpant.us/content/images/2020/05/Screenshot-2020-05-20-20.31.43.png" class="kg-image" alt="Simulating user traffic with Chrome and Golang" loading="lazy" width="2000" height="1096" srcset="https://ghost.justin.palpant.us/content/images/size/w600/2020/05/Screenshot-2020-05-20-20.31.43.png 600w, https://ghost.justin.palpant.us/content/images/size/w1000/2020/05/Screenshot-2020-05-20-20.31.43.png 1000w, https://ghost.justin.palpant.us/content/images/size/w1600/2020/05/Screenshot-2020-05-20-20.31.43.png 1600w, https://ghost.justin.palpant.us/content/images/2020/05/Screenshot-2020-05-20-20.31.43.png 2000w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">flood.io load test simulating 250 constant visitors to this blog for 15 minutes</span></figcaption></img></figure><p>But Flood and similar tools have some limitations. Importantly, though they scale to large numbers, they are not meant to simulate sustained load - jobs typically run for minutes or hours, but not days or weeks. On top of that, the tools are expensive. Flood <a href="https://flood.io/pricing?ref=ghost.justin.palpant.us">offers 500 virtual user-hours per month for free</a>, and then $0.045/virtual user-hour thereafter. While this is great for a few bursts, simulating continuous load of only 10 users would cost upwards of $300/month.</p><h3 id="another-way">Another way</h3><p>I thought it would be interesting to try to build something that would provide a tunable way to load web pages and induce the corresponding stress on my monitoring stack (in the absence of a horde of developers constantly refreshing dashboards). 
My <a href="https://gitlab.palpant.us/justin/chromedp-load-agent?ref=ghost.justin.palpant.us#goals">goals were simple and specific</a>: simulate a full page load on an authenticated web page, in an indefinite loop, with some control over the parallelism on the client side.</p><p>For years, driving Chrome or other browsers via automation has been a staple of integration testing via frameworks like Selenium. Today, the <a href="https://chromedevtools.github.io/devtools-protocol/?ref=ghost.justin.palpant.us">Chrome DevTools Protocol</a>, an API maintained by Google, facilitates this. It allows programmatic control of a Chrome browser instance from another process.</p><p>Several libraries wrapping the DevTools Protocol have been made for different languages to allow fine-grained control: <a href="https://github.com/puppeteer/puppeteer?ref=ghost.justin.palpant.us">Puppeteer</a> for NodeJS (maintained by the Chrome development team), <a href="/p/b8890c39-cf93-4898-aea0-f31725f31f0e/github.com/chromedp/chromedp">github.com/chromedp/chromedp</a> for golang, <a href="https://docs.rs/headless_chrome/0.9.0/headless_chrome/?ref=ghost.justin.palpant.us">headless_chrome</a> for Rust. These libraries are at varying levels of maturity and feature-completeness, and if you are interested in building a new solution for driving Chrome, they are a great place to start!</p><p>On top of these libraries, a few enterprising developers have built load-testing tools, such as <a href="https://github.com/svenkatreddy/puppeteer-loadtest?ref=ghost.justin.palpant.us">puppeteer-loadtest</a> and the powerful <a href="https://github.com/thomasdondorf/puppeteer-cluster?ref=ghost.justin.palpant.us">puppeteer-cluster</a>. 
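</p><p>To give a flavor of what driving Chrome from Go looks like, here is a minimal, hypothetical sketch - not any real project's code - that uses chromedp to load a page over and over in a headless browser:</p><div class="kg-card kg-code-card gatsby-highlight" data-language="go"><pre class="language-go"><code class="language-go">package main

import (
	"context"
	"log"
	"time"

	"github.com/chromedp/chromedp"
)

func main() {
	// Start one headless browser and reuse it for every page load.
	ctx, cancel := chromedp.NewContext(context.Background())
	defer cancel()

	for {
		// Bound each load so a hung page can't stall the loop.
		loadCtx, cancelLoad := context.WithTimeout(ctx, 30*time.Second)
		err := chromedp.Run(loadCtx,
			chromedp.Navigate("https://example.com/"),
			chromedp.WaitReady("body", chromedp.ByQuery),
		)
		cancelLoad()
		if err != nil {
			log.Printf("page load failed: %v", err)
		}
	}
}</code></pre></div><p>Each iteration exercises the full page-load path - HTML, scripts, and the requests those scripts trigger - which is exactly the traffic that plain HTTP benchmarks miss.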
</p><p>Given that I'm more familiar with golang than the other two languages, I thought to see if I could scratch out a simple binary to meet my goals from chromedp.</p><h2 id="gitlabpalpantusjustinchromedp-load-agent"><a href="https://gitlab.palpant.us/justin/chromedp-load-agent?ref=ghost.justin.palpant.us">gitlab.palpant.us/justin/chromedp-load-agent</a></h2><p>Courtesy of the power and ease of golang, chromedp, Docker, and Kubernetes, in just a handful of hours I made a tool that:</p><ul><li>Loads web pages continuously in a headless Chrome browser, with URLs specified via file or CLI argument</li><li>Simulates a complete page load, including awaiting <code class="language-text">load</code> and optionally <code class="language-text">networkIdle0</code> events (<a href="https://github.com/puppeteer/puppeteer/blob/master/docs/api.md?ref=ghost.justin.palpant.us#pagegotourl-options">meaning no network requests have been made for 500ms</a>), with configurable timeout</li><li>Supports arbitrary HTTP headers, TLS verification using default CAs, or skipping TLS verification (for unsigned HTTPS websites).</li><li>Configurable parallelism via a reusable pool of browser tabs</li><li>Can be (<a href="https://gitlab.palpant.us/justin/chromedp-load-agent/-/blob/master/deploy/kubectl-apply/gke/deployment.yaml?ref=ghost.justin.palpant.us">and is!</a>) run on Kubernetes, with <a href="https://hub.docker.com/r/jpalpant/chromedp-load-agent?ref=ghost.justin.palpant.us">jpalpant/chromedp-load-agent</a> published to DockerHub</li><li>Can take screenshots of the page to validate a successful page load</li></ul><figure class="kg-card kg-gallery-card kg-width-wide kg-card-hascaption"><div class="kg-gallery-container"><div class="kg-gallery-row"><div class="kg-gallery-image"><img src="https://ghost.justin.palpant.us/content/images/2020/05/https_gitlab_palpant_us_grafana_d_dji_liazz_cluster_metrics_usage_orgid_1_refresh_30s.png" width="1920" height="1080" loading="lazy" alt="Simulating 
user traffic with Chrome and Golang" srcset="https://ghost.justin.palpant.us/content/images/size/w600/2020/05/https_gitlab_palpant_us_grafana_d_dji_liazz_cluster_metrics_usage_orgid_1_refresh_30s.png 600w, https://ghost.justin.palpant.us/content/images/size/w1000/2020/05/https_gitlab_palpant_us_grafana_d_dji_liazz_cluster_metrics_usage_orgid_1_refresh_30s.png 1000w, https://ghost.justin.palpant.us/content/images/size/w1600/2020/05/https_gitlab_palpant_us_grafana_d_dji_liazz_cluster_metrics_usage_orgid_1_refresh_30s.png 1600w, https://ghost.justin.palpant.us/content/images/2020/05/https_gitlab_palpant_us_grafana_d_dji_liazz_cluster_metrics_usage_orgid_1_refresh_30s.png 1920w" sizes="(min-width: 720px) 720px"/></div><div class="kg-gallery-image"><img src="https://ghost.justin.palpant.us/content/images/2020/05/https_gitlab_palpant_us_grafana_d_hb7fse0zz_node_status_orgid_1_refresh_1m_from_now_1h_to_now.png" width="1920" height="1080" loading="lazy" alt="Simulating user traffic with Chrome and Golang" srcset="https://ghost.justin.palpant.us/content/images/size/w600/2020/05/https_gitlab_palpant_us_grafana_d_hb7fse0zz_node_status_orgid_1_refresh_1m_from_now_1h_to_now.png 600w, https://ghost.justin.palpant.us/content/images/size/w1000/2020/05/https_gitlab_palpant_us_grafana_d_hb7fse0zz_node_status_orgid_1_refresh_1m_from_now_1h_to_now.png 1000w, https://ghost.justin.palpant.us/content/images/size/w1600/2020/05/https_gitlab_palpant_us_grafana_d_hb7fse0zz_node_status_orgid_1_refresh_1m_from_now_1h_to_now.png 1600w, https://ghost.justin.palpant.us/content/images/2020/05/https_gitlab_palpant_us_grafana_d_hb7fse0zz_node_status_orgid_1_refresh_1m_from_now_1h_to_now.png 1920w" sizes="(min-width: 720px) 720px"/></div><div class="kg-gallery-image"><img src="https://ghost.justin.palpant.us/content/images/2020/05/https_gitlab_palpant_us_grafana_d_i8erisazk_availability_orgid_1.png" width="1920" height="1080" loading="lazy" alt="Simulating user traffic with Chrome and Golang" 
srcset="https://ghost.justin.palpant.us/content/images/size/w600/2020/05/https_gitlab_palpant_us_grafana_d_i8erisazk_availability_orgid_1.png 600w, https://ghost.justin.palpant.us/content/images/size/w1000/2020/05/https_gitlab_palpant_us_grafana_d_i8erisazk_availability_orgid_1.png 1000w, https://ghost.justin.palpant.us/content/images/size/w1600/2020/05/https_gitlab_palpant_us_grafana_d_i8erisazk_availability_orgid_1.png 1600w, https://ghost.justin.palpant.us/content/images/2020/05/https_gitlab_palpant_us_grafana_d_i8erisazk_availability_orgid_1.png 1920w" sizes="(min-width: 720px) 720px"/></div></div><div class="kg-gallery-row"><div class="kg-gallery-image"><img src="https://ghost.justin.palpant.us/content/images/2020/05/https_gitlab_palpant_us_grafana_d_nginx_nginx_ingress_controller_orgid_1_refresh_10s.png" width="1920" height="1080" loading="lazy" alt="Simulating user traffic with Chrome and Golang" srcset="https://ghost.justin.palpant.us/content/images/size/w600/2020/05/https_gitlab_palpant_us_grafana_d_nginx_nginx_ingress_controller_orgid_1_refresh_10s.png 600w, https://ghost.justin.palpant.us/content/images/size/w1000/2020/05/https_gitlab_palpant_us_grafana_d_nginx_nginx_ingress_controller_orgid_1_refresh_10s.png 1000w, https://ghost.justin.palpant.us/content/images/size/w1600/2020/05/https_gitlab_palpant_us_grafana_d_nginx_nginx_ingress_controller_orgid_1_refresh_10s.png 1600w, https://ghost.justin.palpant.us/content/images/2020/05/https_gitlab_palpant_us_grafana_d_nginx_nginx_ingress_controller_orgid_1_refresh_10s.png 1920w" sizes="(min-width: 720px) 720px"/></div><div class="kg-gallery-image"><img src="https://ghost.justin.palpant.us/content/images/2020/05/https_gitlab_palpant_us_grafana_d_qrxxp8pzz_storage_orgid_1.png" width="1920" height="1080" loading="lazy" alt="Simulating user traffic with Chrome and Golang" srcset="https://ghost.justin.palpant.us/content/images/size/w600/2020/05/https_gitlab_palpant_us_grafana_d_qrxxp8pzz_storage_orgid_1.png 600w, 
https://ghost.justin.palpant.us/content/images/size/w1000/2020/05/https_gitlab_palpant_us_grafana_d_qrxxp8pzz_storage_orgid_1.png 1000w, https://ghost.justin.palpant.us/content/images/size/w1600/2020/05/https_gitlab_palpant_us_grafana_d_qrxxp8pzz_storage_orgid_1.png 1600w, https://ghost.justin.palpant.us/content/images/2020/05/https_gitlab_palpant_us_grafana_d_qrxxp8pzz_storage_orgid_1.png 1920w" sizes="(min-width: 720px) 720px"/></div><div class="kg-gallery-image"><img src="https://ghost.justin.palpant.us/content/images/2020/05/https_gitlab_palpant_us_grafana_d_v3upgg3wz_foldingathome_orgid_1_refresh_30s.png" width="1920" height="1080" loading="lazy" alt="Simulating user traffic with Chrome and Golang" srcset="https://ghost.justin.palpant.us/content/images/size/w600/2020/05/https_gitlab_palpant_us_grafana_d_v3upgg3wz_foldingathome_orgid_1_refresh_30s.png 600w, https://ghost.justin.palpant.us/content/images/size/w1000/2020/05/https_gitlab_palpant_us_grafana_d_v3upgg3wz_foldingathome_orgid_1_refresh_30s.png 1000w, https://ghost.justin.palpant.us/content/images/size/w1600/2020/05/https_gitlab_palpant_us_grafana_d_v3upgg3wz_foldingathome_orgid_1_refresh_30s.png 1600w, https://ghost.justin.palpant.us/content/images/2020/05/https_gitlab_palpant_us_grafana_d_v3upgg3wz_foldingathome_orgid_1_refresh_30s.png 1920w" sizes="(min-width: 720px) 720px"/></div></div></div><figcaption><p><span style="white-space: pre-wrap;">Post-load screenshots from my Grafana instance showing all queries complete, taken with </span><code spellcheck="false" style="white-space: pre-wrap;"><span>chromedp-load-agent test</span></code><span style="white-space: pre-wrap;"> while visiting </span><a href="https://gitlab.palpant.us/justin/chromedp-load-agent/-/blob/master/samples/grafana.txt?ref=ghost.justin.palpant.us"><span style="white-space: pre-wrap;">these dashboards</span></a></p></figcaption></figure><h3 id="whats-left">What's left</h3><p>Right now, <a 
href="https://gitlab.palpant.us/justin/chromedp-load-agent?ref=ghost.justin.palpant.us">chromedp-load-agent</a> is untested, the code isn't very well organized (on account of being my first project using <a href="https://github.com/spf13/cobra?ref=ghost.justin.palpant.us">spf13/cobra</a>), it's expensive to run, and it's brittle. While functional, it has a lot of limitations as a long-running service. I'm not interested in making it into a library, but if I can, I'd love to improve other aspects:</p><ul><li>Health checks suitable for a long-running server process</li><li>Prometheus metrics for application statistics, like successful or failed page loads</li><li>More useful screenshot handling, like a web interface showing the most recent screenshot for each URL</li></ul><p>Beyond that, there's a lot of potential for a reusable library that automates this work in Golang, but I think that's not a direction I want to pursue right now.</p><p>But the important thing: it works. I set out to add consistent, configurable load to my monitoring system, and this is what QPS looks like now:</p>
<!--kg-card-begin: html-->
<iframe src="https://grafana.palpant.us/dashboard-solo/snapshot/3APZiXtLR2zaelyRSQM2vSApnY1OC2Ex?orgId=1&#x26;from=1589039659054&#x26;to=1589266799000&#x26;var-DS_PROMETHEUS=&#x26;var-namespace=All&#x26;var-controller_class=nginx&#x26;var-controller=All&#x26;var-ingress=gitlab-cloudnative-grafana&#x26;panelId=86" style="width:100%" height="400" frameborder="0"/>
<!--kg-card-end: html-->
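<p>For scale, the steady rate in that snapshot lines up with a simple estimate: each browser tab completes one page load per load time, and each page load fans out into many HTTP requests. A back-of-envelope sketch (the tab count, requests per page, and load time here are illustrative numbers, not measurements from the agent):</p>
<!--kg-card-begin: markdown-->
```shell
# Rough request rate generated by the load agent:
#   QPS ~= tabs * requests_per_page / seconds_per_page_load
tabs=4; requests_per_page=50; load_seconds=2
awk -v t="$tabs" -v r="$requests_per_page" -v s="$load_seconds" \
    'BEGIN { printf "~%.0f requests/second\n", t * r / s }'
```
<!--kg-card-end: markdown-->
<p>With those example numbers, a single agent accounts for around 100 requests per second at the ingress.</p>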
<p>So, did it reveal anything interesting? Or was this exercise a waste of time?</p><h2 id="what-i-learned">What I learned</h2><h3 id="cloud-compute-costs">Cloud-compute costs</h3><p>The first thing that surprised me was how quickly the TCO of my small websites increased under constant load. I run most things on GCP via a GKE cluster, and of course my personal Grafana instance sees very little traffic day-to-day, but I wouldn't have guessed where - and <em>how much</em> - an increase in traffic would cost.</p><p>As with most cloud providers, Google charges for a wide variety of usage SKUs. Some, like CPU and memory, are obvious; others, like static IP addresses, load balancer rules, cluster management fees, and monthly storage fees, are less obvious, but still intuitive. However, the constant page loads suddenly revealed a variety of non-intuitive costs for services that were previously inexpensive. The main culprit turned out to be<em> (drumroll)...</em></p>
<!--kg-card-begin: html-->
<iframe src="https://giphy.com/embed/116seTvbXx07F6" style="width:100%" height="400" frameborder="0" class="giphy-embed" allowfullscreen=""/>
<!--kg-card-end: html-->
<p><a href="https://giphy.com/gifs/mardi-gras-116seTvbXx07F6?ref=ghost.justin.palpant.us"><strong><em>Log ingestion</em></strong></a></p><p>Believe it or not, GCP makes you pay through the nose for log ingestion once you pass the <a href="https://cloud.google.com/stackdriver/pricing?ref=ghost.justin.palpant.us#logging-costs">free usage threshold of 50GB</a>. With NGINX logs and traces from various services being emitted on every request, even my small cluster consumed that allotment and rapidly started accruing log charges at a rate of $0.50/GB (at, say, 2KB of logs per request, 100 QPS produces roughly 500GB of logs a month). Fortunately, Cloud Logging allows <a href="https://cloud.google.com/logging/docs/exclusions?ref=ghost.justin.palpant.us">flexible exclusion filters</a>, and I was able to bring that cost back under control.</p><p>Beyond logs, I also noticed a sharp spike in charges due to <strong>Network Egress</strong> (data leaving GCP, since the load agent downloads every page from outside the cluster) and <strong>GCS Class B requests</strong>. The latter was interesting to me, and difficult to resolve. I use <a href="https://github.com/thanos-io/thanos?ref=ghost.justin.palpant.us">Thanos</a> (a CNCF project) as <a href="https://thanos.palpant.us/bucket?ref=ghost.justin.palpant.us">part of my monitoring stack</a>, and Thanos serves metrics from Google Cloud Storage. Thanos Store and Thanos Query, the components responsible for handling requests for metrics, offer very little in the way of caching, so every page load required downloading a piece of a GCS object to eventually display to the visitor in a Grafana dashboard.</p><p>At a lower level, Thanos Store reports statistics on these operations - the relevant operation was the <code class="language-text">bucket get_range</code> request (from <a href="https://github.com/thanos-io/thanos/blob/master/pkg/objstore/gcs/gcs.go?ref=ghost.justin.palpant.us#L124">gcs.go</a>)</p>
<!--kg-card-begin: html-->
<iframe src="https://grafana.palpant.us/dashboard-solo/snapshot/ACfPj7wqmmmJdf0w8OrgZamD5ZbFDEwX?orgId=1&#x26;from=1589039659054&#x26;to=1589266799000&#x26;var-interval=10m&#x26;var-namespace=gitlab&#x26;var-labelselector=app&#x26;var-labelvalue=thanos-store&#x26;panelId=5" style="width:100%" height="400" frameborder="0"/>
<!--kg-card-end: html-->
<p>That operation makes a <a href="https://cloud.google.com/storage/pricing?ref=ghost.justin.palpant.us#operations-by-class">storage.*.get</a> request via the GCS JSON API, which is categorized as a Class B operation for billing. GCP charges <a href="https://cloud.google.com/storage/pricing?ref=ghost.justin.palpant.us#operations-pricing">$0.004/10,000 operations</a> for this type of request against a Standard bucket. While that rate seems small, 100 QPS is roughly 260 million operations per month (100 × 86,400 × 30), which works out to about $104/month - a substantial amount, compared to other costs on this cluster. I'll write about how I dealt with this another time; in the meantime, watch <a href="https://www.youtube.com/watch?v=eyBbImSDOrI&#x26;feature=youtu.be&#x26;ref=ghost.justin.palpant.us">this amazing talk</a> by Tom Wilkie and read <a href="https://grafana.com/blog/2019/09/19/how-to-get-blazin-fast-promql/?ref=ghost.justin.palpant.us">the blog post</a>!</p><h3 id="cpu-bottlenecks-and-autoscaling">CPU bottlenecks and autoscaling</h3><p>To my surprise, memory usage for the pods I use stayed relatively stable when the load agent was enabled. However, several pods showed drastic spikes in CPU usage. Notable among these were Grafana, Gitlab's Webservice (formerly "Unicorn"), NGINX, and the <a href="https://github.com/GoogleCloudPlatform/cloudsql-proxy?ref=ghost.justin.palpant.us">CloudSQL proxy</a> I use to tunnel my GCP-managed databases for secure access from within the Kubernetes cluster.</p><p>Normally, this would be fine - my cluster runs with excess capacity, and CPU is <a href="https://cloud.google.com/blog/products/gcp/kubernetes-best-practices-resource-requests-and-limits?ref=ghost.justin.palpant.us">a compressible resource</a>, which means (in part) that it can be over-consumed without Kubernetes needing to terminate any pods or increase cluster capacity. Instead, the pods are throttled - prevented from consuming more CPU than their limits by temporarily descheduling their processes.</p>
<!--kg-card-begin: html-->
<iframe src="https://grafana.palpant.us/dashboard-solo/snapshot/akxe78IMzfvI8VZVLSdNYzWnLAtDLHSH?orgId=0&#x26;from=1589039659054&#x26;to=1589266799000&#x26;var-Node=All&#x26;var-namespace=gitlab&#x26;var-pod=gitlab-cloudnative-grafana-744f47bf8b-ffmcc&#x26;var-pod=gitlab-cloudnative-grafana-744f47bf8b-jtpk9&#x26;var-pod=gitlab-cloudnative-grafana-744f47bf8b-zqmrp&#x26;var-cluster=gitlab-cloudnative-prometheus&#x26;panelId=49" style="width:100%" height="400" frameborder="0"/>
<!--kg-card-end: html-->
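<p>Concretely, a pod's CPU limit is enforced through the kernel's CFS bandwidth controls: in every scheduling period (100ms by default), the container may consume its limit's worth of CPU time, after which its threads are paused until the next period begins. A sketch of the mapping (the 500m limit is an illustrative value, not one from my manifests):</p>
<!--kg-card-begin: markdown-->
```shell
# Kubernetes translates a CPU limit into a CFS quota:
#   cpu.cfs_period_us = 100000   (100ms, the kernel default)
#   cpu.cfs_quota_us  = limit_in_millicores * period_us / 1000
limit_millicores=500   # i.e. "cpu: 500m" in a pod spec
period_us=100000
quota_us=$(( limit_millicores * period_us / 1000 ))
echo "cpu.cfs_quota_us=${quota_us}"   # 50000us of CPU time per 100000us period
```
<!--kg-card-end: markdown-->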
<p>Unfortunately, CPU throttling on any process that serves user traffic can lead to poor, as well as inconsistent, performance - a process in the midst of a request could suddenly be put on pause, delaying those requests being served and increasing their latency by an unpredictable amount - like this:</p>
<!--kg-card-begin: html-->
<iframe src="https://grafana.palpant.us/dashboard-solo/snapshot/Y9Ioo622FBFt7P5339KIymdTbYe3JFD4?orgId=1&#x26;from=1589039659054&#x26;to=1589266799000&#x26;var-DS_PROMETHEUS=&#x26;var-namespace=All&#x26;var-controller_class=nginx&#x26;var-controller=All&#x26;var-ingress=gitlab-cloudnative-grafana&#x26;var-interval=5m&#x26;panelId=93" style="width:100%" height="400" frameborder="0"/>
<!--kg-card-end: html-->
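<p>You don't have to infer throttling from latency graphs alone - the kernel counts it directly. From inside a container (the path below is the cgroup v1 layout; under cgroup v2 the same counters live in the unified <code class="language-text">cpu.stat</code> file):</p>
<!--kg-card-begin: markdown-->
```shell
# What fraction of CFS periods did this container exhaust its quota in?
# cpu.stat reports nr_periods (total) and nr_throttled (periods that were capped).
awk '/^nr_periods/ {p=$2} /^nr_throttled/ {t=$2} END { if (p > 0) printf "throttled in %.1f%% of periods\n", 100 * t / p }' /sys/fs/cgroup/cpu/cpu.stat
```
<!--kg-card-end: markdown-->
<p>Anything consistently above a few percent on a latency-sensitive pod is a sign the CPU limit is too tight.</p>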
<p>There are a few ways to get around this. In some cases, the right choice is to simply increase the CPU allocation to your pods to prevent the bottleneck. If large changes in load are infrequent, you can pick an allocation that works for your expected load and leave it, updating it manually when need be.</p><p>Sometimes you can't predict what load you need to handle. For those cases, a <a href="https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/?ref=ghost.justin.palpant.us">HorizontalPodAutoscaler</a> can help. This Kubernetes primitive monitors the CPU or memory usage of all pods belonging to a Deployment and, if the usage exceeds a threshold, scales up the Deployment. If your Service is set up in the usual way, requests are automatically load balanced across the new and old Pods once all are available, reducing the CPU needed for each Pod. Scaling up or down is repeated until the CPU usage is within bounds, or the maximum or minimum number of Pods the HPA can use is reached.</p><p>For GitLab, which I deploy via the <a href="https://gitlab.com/gitlab-org/charts/gitlab?ref=ghost.justin.palpant.us">GitLab Cloudnative Helm Chart</a>, an HPA for the webservice deployment can be configured via the <a href="https://gitlab.palpant.us/justin/palpantlab-gitlab/-/blob/master/deploy/helm-upgrade/gke/gitlab-values.yaml?ref=ghost.justin.palpant.us#L149-150"><code class="language-text">gitlab.webservice.hpa</code></a> field, as described <a href="https://gitlab.com/gitlab-org/charts/gitlab/-/tree/master/doc/charts/gitlab/webservice?ref=ghost.justin.palpant.us#installation-command-line-options">in the docs</a>. 
NGINX ingress similarly offers easy HPA configuration, and an HPA for any deployment can be made with <a href="https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/?ref=ghost.justin.palpant.us#support-for-horizontal-pod-autoscaler-in-kubectl"><code class="language-text">kubectl autoscale</code></a> as well - for example, <code class="language-text">kubectl autoscale deployment webservice --cpu-percent=80 --min=1 --max=3</code> (the deployment name and bounds here are just illustrative).</p><hr/><p>If you are thinking about how to load test your website, application, or product, I hope some of this has been useful information! If you have any feedback or suggestions, or are interested in <a href="https://gitlab.palpant.us/justin/chromedp-load-agent?ref=ghost.justin.palpant.us">chromedp-load-agent</a> or similar tools, please <a href="https://justin.palpant.us/get-in-touch/?ref=ghost.justin.palpant.us">get in touch</a>! I'm always learning and looking for input, and happy to chat about infrastructure any time.</p>]]></content:encoded></item><item><title><![CDATA[Maximizing NVIDIA GPU performance on Linux]]></title><description><![CDATA[Have an NVIDIA GPU that you use for gaming, HPC, or machine learning, and want to get maximum performance out of it? 
Learn some tips and tricks to make sure your GPU isn't being silently throttled.]]></description><link>https://justin.palpant.us/monitor-and-maximize-nvidia-gpu-performance-on-linux/</link><guid isPermaLink="false">Ghost__Post__5e7a3d311273fa00083d6980</guid><category><![CDATA[gpu]]></category><category><![CDATA[kernel]]></category><category><![CDATA[linux]]></category><category><![CDATA[nvidia]]></category><category><![CDATA[systemd]]></category><category><![CDATA[gaming]]></category><dc:creator><![CDATA[Justin Palpant]]></dc:creator><pubDate>Sun, 26 Apr 2020 20:18:59 GMT</pubDate><media:content url="https://justin.palpant.us/static/73904e6a31ad065de466e442438de0ea/Screenshot-2020-04-26-15.27.21.png" medium="image"/><content:encoded><![CDATA[<img src="https://justin.palpant.us/static/73904e6a31ad065de466e442438de0ea/Screenshot-2020-04-26-15.27.21.png" alt="Maximizing NVIDIA GPU performance on Linux"/><p>I got an NVIDIA RTX 2080 Super a few months ago. It's a great piece of hardware and up for anything I can throw at it, which so far includes Metro Exodus, Half-Life: Alyx, Folding@Home, and more. But out of the box it was <em>15% less performant</em> than it is now, even while reporting maximum utilization. With a bit of debugging and a few small changes to the system, I've managed to reclaim that performance. Here's what I learned.</p><p><em>This post focuses on finding and addressing bottlenecks affecting GPU compute, but graphics processing can be slowed by many components: a slow CPU can prevent a GPU from running at maximum speed by failing to provide it with work quickly enough; a machine learning task that requires large amounts of data transfer may be limited elsewhere, such as GPU memory bandwidth, disk, or network activity. Rule these out first. 
A good rule of thumb is to check that GPU utilization is reported as nearly 100% while other components are not at their maximums.</em></p><h2 id="identifying-the-potential-for-more-performance">Identifying the potential for more performance</h2><p>I started investigating my GPU's performance after two observations: the first was that latency-sensitive VR games would sometimes stutter or jerk before becoming smooth again, with large spikes in frame latency (from sub-6ms times up to 15-18ms for brief fractions of a second); the second was that when running at maximum utilization, my GPU temperature was pinned at 86<strong>°</strong>C with the GPU fans running at full speed.</p>
<!--kg-card-begin: html-->
<iframe title="GPU temperature graph" src="https://grafana.palpant.us/dashboard-solo/snapshot/63ANDwSEORBvBhwauNFOhWHXAKoDoWKw?orgId=1&#x26;from=1584721377610&#x26;to=1584772261245&#x26;panelId=4" style="width:100%" height="400" frameborder="0"/>
<!--kg-card-end: html-->
<p>Now, a bit of frame drop in a demanding game might be expected, new GPU or not. And it's hard to find good information about what qualifies as "high" temperatures for a GPU, and what the effects of running at high temperatures are. Still, 86<strong>°</strong>C is warm, and since my case is a <a href="https://www.fractal-design.com/products/cases/node/node-202/black/?ref=ghost.justin.palpant.us">Fractal Node 202</a>, an extremely compact mini-ITX that clocks in at 10.2L, cooling was at the top of my mind. I started to learn about what happens to a GPU as it reaches thermal maximums.</p><h3 id="sm-clock-throttling">SM Clock Throttling</h3><p>It turns out that to stay cool, an NVIDIA GPU reduces the clock frequency of its streaming multiprocessor (SM) units, which contain the CUDA cores - for tasks running on those cores, the decrease in performance is proportional to the decrease in frequency. The sign of a throttled GPU is an SM frequency that is uneven - full-power GPUs maintain a stable clock frequency. </p>
<!--kg-card-begin: html-->
<iframe title="GPU frequency graph" src="https://grafana.palpant.us/dashboard-solo/snapshot/vKRI9MBO6eKcVbMtab7ED8JeqWtL5Rb0?orgId=1&#x26;from=1584721377610&#x26;to=1584772261245&#x26;panelId=6" style="width:100%" height="400" frameborder="0"/>
<!--kg-card-end: html-->
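<p>The same signal is visible without a dashboard. <code class="language-text">nvidia-smi</code> can emit the SM clock as bare CSV, and a small filter can flag samples that fall well below the boost clock (the 1815MHz reference is the RTX 2080 Super boost spec; the 95% cutoff is my own arbitrary threshold):</p>
<!--kg-card-begin: markdown-->
```shell
# Sample the SM clock once per second (Ctrl-C to stop) and flag dips;
# a full-power GPU should hold a steady frequency near its boost clock.
nvidia-smi --query-gpu=clocks.sm --format=csv,noheader,nounits -l 1 \
  | awk -v boost=1815 '{ printf "%d MHz (%.1f%% of boost)%s\n", $1, 100 * $1 / boost, ($1 < 0.95 * boost ? "  <- throttled?" : "") }'
```
<!--kg-card-end: markdown-->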
<p><em>Throttling confirmed!</em> The <code class="language-text">SM Clock</code> plot showed clear signs of throttling - oscillating constantly between 1770MHz and 1690MHz, and even dropping to 1650MHz for a sustained window. The reference RTX 2080 Super has a base clock of 1650MHz, with a boost clock of 1815MHz, so these would seem to be good speeds, but the instability in the frequency meant something was wrong.</p><p>On Windows, third-party programs like <a href="https://en.wikipedia.org/wiki/GPU-Z?ref=ghost.justin.palpant.us">GPU-Z</a> can help you detect this by showing a graph of GPU frequency over time. On Linux, the job is somewhat more difficult: you can run <code class="language-text">nvidia-smi -q -d CLOCK</code> to ask for the GPU frequency, but you must run this repeatedly to see if the clock frequency is changing.</p><p>For those of us on Linux and without datacenter-style monitoring, though, there's an easier way!</p><h3 id="performance">PERFORMANCE</h3><p>Just run <code class="language-text">nvidia-smi -q -d PERFORMANCE</code></p><div class="kg-card kg-code-card gatsby-highlight" data-language="bash"><pre class="language-bash"><code class="language-bash">$ nvidia-smi -q -d PERFORMANCE

<span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span>NVSMI <span class="token assign-left variable">LOG</span><span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span>

Driver Version                      <span class="token builtin class-name">:</span> <span class="token number">440.66</span>.08
CUDA Version                        <span class="token builtin class-name">:</span> <span class="token number">10.2</span>

Attached GPUs                       <span class="token builtin class-name">:</span> <span class="token number">1</span>
GPU 00000000:01:00.0
    Performance State               <span class="token builtin class-name">:</span> P2
    Clocks Throttle Reasons
        Idle                        <span class="token builtin class-name">:</span> Not Active
        Applications Clocks Setting <span class="token builtin class-name">:</span> Not Active
        SW Power Cap                <span class="token builtin class-name">:</span> Not Active
        HW Slowdown                 <span class="token builtin class-name">:</span> Not Active
            HW Thermal Slowdown     <span class="token builtin class-name">:</span> Not Active
            HW Power Brake Slowdown <span class="token builtin class-name">:</span> Not Active
        Sync Boost                  <span class="token builtin class-name">:</span> Not Active
        SW Thermal Slowdown         <span class="token builtin class-name">:</span> Active
Display Clock Setting       <span class="token builtin class-name">:</span> Not Active</code></pre></div><p>This is the best list of active throttles I've seen, and when I was investigating, it clearly and consistently showed <code class="language-text">SW Thermal Slowdown</code> - my GPU was too hot. Not hot enough to trigger the emergency brake that is a hardware slowdown, but hot enough to affect performance. Next up was to figure out how to fix it.<em>*</em></p><h2 id="gpu-tuned-air-cooling-on-linux">GPU-tuned air-cooling on Linux</h2><p>It was at this point that I learned something lucky: I had made a dumb mistake in my build and forgotten that the Fractal Node 202 has space for <a href="https://www.fractal-design.com/wp-content/uploads/2019/07/NODE-202-PS.pdf?ref=ghost.justin.palpant.us">two case fans beneath the GPU</a>. These are meant to be static pressure fans, pulling cool air in from outside, with the resulting hot air vented out by the CPU fan. I could add two <a href="https://www.amazon.com/gp/product/B01G5I6MYI/ref=ppx_yo_dt_b_asin_title_o06_s00?ie=UTF8&#x26;psc=1&#x26;ref=ghost.justin.palpant.us">Corsair ML120 Pro Blue 120mm</a> fans as case fans easily enough.</p><h3 id="improving-fan-control">Improving Fan Control</h3><p>My mini-ITX motherboard is the <a href="https://www.gigabyte.com/us/Motherboard/Z390-I-AORUS-PRO-WIFI-rev-10?ref=ghost.justin.palpant.us#kf">Gigabyte Z390 I Aorus Pro Wifi</a>, which has three fan headers and comes with the Smart Fan 5 fan control software in the BIOS. This was sufficient to make sure the fans turned on with default settings, but the control with <a href="https://www.gigabyte.com/mb/am4/cooling?ref=ghost.justin.palpant.us">Smart Fan 5</a> is limited: you can tie any of your fans to the CPU temperature, the PCH temperature, or an ambient temperature sensor somewhat removed from the CPU, and while the available fan curves are highly customizable, they are finicky. 
</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://ghost.justin.palpant.us/content/images/2020/04/smartfan5-4.jpg" class="kg-image" alt="Maximizing NVIDIA GPU performance on Linux" loading="lazy" width="771" height="504" srcset="https://ghost.justin.palpant.us/content/images/size/w600/2020/04/smartfan5-4.jpg 600w, https://ghost.justin.palpant.us/content/images/2020/04/smartfan5-4.jpg 771w" sizes="(min-width: 720px) 720px"/><figcaption><a href="https://www.gigabyte.com/mb/am4/cooling?ref=ghost.justin.palpant.us"><span style="white-space: pre-wrap;">Smart Fan 5</span></a><span style="white-space: pre-wrap;"> supports multiple fans with complex fan curves, but motherboard temperature sensors weren't a good choice to eliminate thermal throttling on the GPU</span></figcaption></figure><p>Unfortunately, tying case fan speed to ambient temperature meant that these fans wouldn't spin up when the GPU was under load; tying it to CPU temperature meant that the fans would rapidly spin up and down even when the GPU was inactive, as CPU temperatures tend to be more variable than the temperatures of other components. Neither solution was sufficient.</p><h3 id="lm-sensors-and-fancontrol">lm-sensors and fancontrol</h3><p>The go-to for fan speed control on Linux is a combination of <a href="https://github.com/lm-sensors/lm-sensors?ref=ghost.justin.palpant.us">lm-sensors</a>, a powerful general-purpose hardware monitoring package, and <a href="http://manpages.ubuntu.com/manpages/bionic/man8/fancontrol.8.html?ref=ghost.justin.palpant.us">fancontrol</a>, a simple but useful script that monitors arbitrary temperature sensors and controls PWM outputs, in an infinite loop. 
On Ubuntu, both can be installed with <code class="language-text">apt</code> and configured:</p><div class="kg-card kg-code-card gatsby-highlight" data-language="bash"><pre class="language-bash"><code class="language-bash">$ <span class="token function">sudo</span> <span class="token function">apt</span> <span class="token function">install</span> lm-sensors
$ <span class="token function">sudo</span> sensors-detect
$ <span class="token function">sudo</span> pwmconfig</code></pre></div><p>For many systems this is sufficient to expose the CPU temperature sensors as well as the PWM outputs and sensors which provide fan speed control and feedback.</p><p>However, this doesn't work on this particular Gigabyte motherboard. </p><p>The Gigabyte motherboard uses a temperature sensor which isn't natively supported by the Linux kernel. Fortunately, there was once an enterprising developer who made a kernel module, it87.ko, which supports a large number of sensors of this type. The original maintainer chose to <a href="https://www.phoronix.com/scan.php?page=news_item&#x26;px=IT87-Linux-Driver-Axing&#x26;ref=ghost.justin.palpant.us">stop maintaining the repository</a>, but several forks exist. I chose <a href="https://github.com/hannesha/it87?ref=ghost.justin.palpant.us">hannesha/it87</a>, and installed it as a DKMS module so that it is rebuilt automatically for each future kernel I install.</p><figure class="kg-card kg-code-card"><div class="kg-card kg-code-card gatsby-highlight" data-language="bash"><pre class="language-bash"><code class="language-bash">$ <span class="token builtin class-name">cd</span> ~
$ <span class="token function">git</span> clone https://github.com/hannesha/it87
$ <span class="token builtin class-name">cd</span> it87
$ <span class="token function">make</span>
$ <span class="token function">sudo</span> <span class="token function">make</span> dkms</code></pre></div><figcaption><p><span style="white-space: pre-wrap;">Install it87.ko to add support for the Gigabyte Z390 fan control and sensors</span></p></figcaption></figure><p>To enable an installed module like this, you would typically use <a href="https://linux.die.net/man/8/modprobe?ref=ghost.justin.palpant.us">modprobe</a>, but here there was an issue: this repository is not kept up-to-date with newer motherboard specifications, and so when it attempts to detect the relevant hardware (which happens when the module is loaded), it fails - it is unable to detect the correct device.</p><figure class="kg-card kg-code-card"><div class="kg-card kg-code-card gatsby-highlight" data-language="bash"><pre class="language-bash"><code class="language-bash">$ <span class="token function">sudo</span> modprobe it87
modprobe: ERROR: could not insert <span class="token string">'it87'</span><span class="token builtin class-name">:</span> No such device</code></pre></div><figcaption><p><span style="white-space: pre-wrap;">it87.ko cannot be loaded by modprobe with default parameters</span></p></figcaption></figure><p>Others have <a href="https://github.com/a1wong/it87/issues/1?ref=ghost.justin.palpant.us">run into this issue on a similar motherboard</a> - the it87 kernel module has an argument, <code class="language-text">force_id</code>, with which you can specify the hardware configuration it should target. Though none of the available configurations is a perfect match for the Z390 (which is why automatic matching fails), some match closely enough that forcing the ID manually results in working access to the sensors. </p><div class="kg-card kg-code-card gatsby-highlight" data-language="bash"><pre class="language-bash"><code class="language-bash">$ <span class="token function">sudo</span> modprobe it87 <span class="token assign-left variable">force_id</span><span class="token operator">=</span>0x8628
$ <span class="token function">sudo</span> sensors-detect
<span class="token punctuation">..</span>.
Some Super I/O chips contain embedded sensors. We have to <span class="token function">write</span> to
standard I/O ports to probe them. This is usually safe.
Do you want to scan <span class="token keyword">for</span> Super I/O sensors? <span class="token punctuation">(</span>YES/no<span class="token punctuation">)</span>: 
Probing <span class="token keyword">for</span> Super-I/O at 0x2e/0x2f
Trying family `National Semiconductor/ITE'...               No
Trying family `SMSC'...                                     No
Trying family `VIA/Winbond/Nuvoton/Fintek'...               No
Trying family `ITE'<span class="token punctuation">..</span>.                                      Yes
Found unknown chip with ID 0x8688
<span class="token punctuation">..</span>.

$ sensors
<span class="token punctuation">..</span>.
it8628-isa-0a40
Adapter: ISA adapter
in0:          +1.12 V  <span class="token punctuation">(</span>min <span class="token operator">=</span>  +0.00 V, max <span class="token operator">=</span>  +3.06 V<span class="token punctuation">)</span>
in1:          +2.00 V  <span class="token punctuation">(</span>min <span class="token operator">=</span>  +0.00 V, max <span class="token operator">=</span>  +3.06 V<span class="token punctuation">)</span>
in2:          +2.03 V  <span class="token punctuation">(</span>min <span class="token operator">=</span>  +0.00 V, max <span class="token operator">=</span>  +3.06 V<span class="token punctuation">)</span>
in3:          +2.02 V  <span class="token punctuation">(</span>min <span class="token operator">=</span>  +0.00 V, max <span class="token operator">=</span>  +3.06 V<span class="token punctuation">)</span>
in4:          +0.00 V  <span class="token punctuation">(</span>min <span class="token operator">=</span>  +0.00 V, max <span class="token operator">=</span>  +3.06 V<span class="token punctuation">)</span>  ALARM
in5:          +1.06 V  <span class="token punctuation">(</span>min <span class="token operator">=</span>  +0.00 V, max <span class="token operator">=</span>  +3.06 V<span class="token punctuation">)</span>
in6:          +1.21 V  <span class="token punctuation">(</span>min <span class="token operator">=</span>  +0.00 V, max <span class="token operator">=</span>  +3.06 V<span class="token punctuation">)</span>
3VSB:         +3.38 V  <span class="token punctuation">(</span>min <span class="token operator">=</span>  +0.00 V, max <span class="token operator">=</span>  +6.12 V<span class="token punctuation">)</span>
Vbat:         +3.19 V  
fan1:        <span class="token number">1496</span> RPM  <span class="token punctuation">(</span>min <span class="token operator">=</span>    <span class="token number">0</span> RPM<span class="token punctuation">)</span>
fan2:        <span class="token number">1541</span> RPM  <span class="token punctuation">(</span>min <span class="token operator">=</span>    <span class="token number">0</span> RPM<span class="token punctuation">)</span>
fan3:        <span class="token number">1464</span> RPM  <span class="token punctuation">(</span>min <span class="token operator">=</span>    <span class="token number">0</span> RPM<span class="token punctuation">)</span>
temp1:        +57.0°C  <span class="token punctuation">(</span>low  <span class="token operator">=</span> +127.0°C, high <span class="token operator">=</span> +127.0°C<span class="token punctuation">)</span>  sensor <span class="token operator">=</span> thermistor
temp2:        +64.0°C  <span class="token punctuation">(</span>low  <span class="token operator">=</span> +127.0°C, high <span class="token operator">=</span> +127.0°C<span class="token punctuation">)</span>  sensor <span class="token operator">=</span> thermistor
temp3:        +77.0°C  <span class="token punctuation">(</span>low  <span class="token operator">=</span> +127.0°C, high <span class="token operator">=</span> +127.0°C<span class="token punctuation">)</span>
temp4:         +0.0°C  <span class="token punctuation">(</span>low  <span class="token operator">=</span>  +0.0°C, high <span class="token operator">=</span> +127.0°C<span class="token punctuation">)</span>
temp5:        +65.0°C  <span class="token punctuation">(</span>low  <span class="token operator">=</span>  +0.0°C, high <span class="token operator">=</span> -120.0°C<span class="token punctuation">)</span>
temp6:        +63.0°C  <span class="token punctuation">(</span>low  <span class="token operator">=</span>  +0.0°C, high <span class="token operator">=</span> +127.0°C<span class="token punctuation">)</span>
intrusion0:  OK</code></pre></div><p>And just like that, I could see my fan speeds as well as a number of other sensors, and <code class="language-text">pwmconfig</code> was able to successfully detect the correct fan control PWM outputs.</p><p>To make this permanent, it's necessary to put the new kernel module into <code class="language-text">/etc/modules</code>, with the custom options in a separate conf file in <code class="language-text">/etc/modprobe.d</code>:</p><figure class="kg-card kg-code-card"><div class="kg-card kg-code-card gatsby-highlight" data-language="bash"><pre class="language-bash"><code class="language-bash">dm-snapshot

<span class="token comment"># Generated by sensors-detect on Sun Jan 21 22:03:04 2018</span>
<span class="token comment"># Chip drivers</span>
coretemp

<span class="token comment"># Added manually, 2020-03-24, see hannesha/it87</span>
it87</code></pre></div><figcaption><p><code spellcheck="false" style="white-space: pre-wrap;"><span>/etc/modules</span></code><span style="white-space: pre-wrap;"> with it87 specified manually, coretemp found by </span><code spellcheck="false" style="white-space: pre-wrap;"><span>sensors-detect</span></code><span style="white-space: pre-wrap;">. Note that adding custom options here will not allow the module to be loaded on boot, and an error will be logged.</span></p></figcaption></figure><figure class="kg-card kg-code-card"><div class="kg-card kg-code-card gatsby-highlight" data-language="bash"><pre class="language-bash"><code class="language-bash"><span class="token comment"># force kernel to assume the IT87 chip is similar to chip 0x8628, even though it isn't</span>
<span class="token comment"># seems to work on Z390 I Pro Wifi</span>
options it87 <span class="token assign-left variable">force_id</span><span class="token operator">=</span>0x8628</code></pre></div><figcaption><p><code spellcheck="false" style="white-space: pre-wrap;"><span>/etc/modprobe.d/it87.conf</span></code></p></figcaption></figure><h3 id="gpu-temperature-fan-control">GPU temperature fan control</h3><p>Having <code class="language-text">fancontrol</code> control the case fans was great, and easier to modify than leaving fan control in the BIOS, but still didn't solve the original problem: I needed my case fan speed to depend on GPU temperature.</p><p>At this point <a href="https://unix.stackexchange.com/questions/499409/adjust-fan-speed-via-fancontrol-according-to-hard-disk-temperature-hddtemp?ref=ghost.justin.palpant.us">a StackOverflow post about connecting HDD temperatures</a> to <a href="https://unix.stackexchange.com/questions/499409/adjust-fan-speed-via-fancontrol-according-to-hard-disk-temperature-hddtemp?ref=ghost.justin.palpant.us"><code class="language-text">fancontrol</code></a> revealed that <code class="language-text">fancontrol</code> treats temperature sensors as simple files, so while it will by default read from <code class="language-text">/sys/class/hwmon/{sensorpath}</code>, you can also specify an arbitrary file path from <code class="language-text">/</code> as a sensor input in <code class="language-text">/etc/fancontrol</code>. 
This allows you to update a file with an arbitrary temperature and have <code class="language-text">fancontrol</code> use that file's content as if it were a sensor.</p><p>With a quick bash script that uses <code class="language-text">nvidia-smi</code> to read the temperature from multiple GPUs and write those values to files, and a systemd unit to run it as a service, I could create a <code class="language-text">fancontrol</code>-compatible "GPU-temperature sensor":</p><figure class="kg-card kg-code-card"><div class="kg-card kg-code-card gatsby-highlight" data-language="bash"><pre class="language-bash"><code class="language-bash"><span class="token shebang important">#!/bin/bash</span>
<span class="token comment"># Read NVIDIA GPU temperatures and write to a file on a duty cycle</span>

<span class="token assign-left variable">HELPTEXT</span><span class="token operator">=</span><span class="token string">"\
Export GPU temperatures to a directory. Each GPU is written to a file 'gpu_{gpu number}' in the directory.

Usage: export-gpu-temp --loop 2 --output /var/opt/gputemps --gpu 0 --gpu 1
Options:
  -o/--output (required) - path to a directory in which to write GPU temperatures
  -l/--loop (required) - time to sleep between GPU temperature query cycles, in seconds
  --gpu (required, multiple) - GPU number to query; repeat for multiple GPUs
"</span>

<span class="token builtin class-name">set</span> -euo pipefail

<span class="token assign-left variable">GPUS</span><span class="token operator">=</span><span class="token punctuation">(</span><span class="token punctuation">)</span>
<span class="token keyword">while</span> <span class="token punctuation">[</span><span class="token punctuation">[</span> <span class="token variable">$#</span> -gt <span class="token number">0</span> <span class="token punctuation">]</span><span class="token punctuation">]</span><span class="token punctuation">;</span> <span class="token keyword">do</span>
    <span class="token assign-left variable">key</span><span class="token operator">=</span><span class="token string">"<span class="token variable">$1</span>"</span>

    <span class="token keyword">case</span> <span class="token variable">$key</span> <span class="token keyword">in</span>
        -h<span class="token operator">|</span>--help<span class="token punctuation">)</span>
        <span class="token builtin class-name">echo</span> <span class="token string">"<span class="token variable">$HELPTEXT</span>"</span>
        <span class="token builtin class-name">exit</span> <span class="token number">0</span>
        <span class="token punctuation">;</span><span class="token punctuation">;</span>
        -o<span class="token operator">|</span>--output<span class="token punctuation">)</span>
        <span class="token assign-left variable">dirpath_output</span><span class="token operator">=</span><span class="token variable">$2</span>
        if ! [ -d "$dirpath_output" ] || ! [ -w "$dirpath_output" ]; then
            echo "$dirpath_output is not a writable directory"
            exit 1
        fi
        <span class="token builtin class-name">shift</span>
        <span class="token builtin class-name">shift</span>
        <span class="token punctuation">;</span><span class="token punctuation">;</span>
        -l<span class="token operator">|</span>--loop<span class="token punctuation">)</span>
        <span class="token assign-left variable">loop_time</span><span class="token operator">=</span><span class="token variable">$2</span>
        <span class="token comment"># [[ &#x3C; ]] compares strings, not numbers, so use awk for the float comparison</span>
        if awk "BEGIN { exit !($loop_time &#x3C; 0.1) }"; then
            <span class="token builtin class-name">echo</span> <span class="token string">"loop_time is very small (<span class="token variable">${loop_time}</span>s), this may cause extra load on your GPU!"</span>
        <span class="token keyword">fi</span>
        <span class="token builtin class-name">shift</span>
        <span class="token builtin class-name">shift</span>
        <span class="token punctuation">;</span><span class="token punctuation">;</span>
        --gpu<span class="token punctuation">)</span>
        <span class="token assign-left variable">GPUS</span><span class="token operator">+=</span><span class="token punctuation">(</span><span class="token string">"<span class="token variable">$2</span>"</span><span class="token punctuation">)</span>
        <span class="token builtin class-name">shift</span>
        <span class="token builtin class-name">shift</span>
        <span class="token punctuation">;</span><span class="token punctuation">;</span>
        *<span class="token punctuation">)</span>
        <span class="token builtin class-name">echo</span> <span class="token string">"Unknown option <span class="token variable">$1</span>"</span>
        <span class="token builtin class-name">exit</span> 1
        <span class="token punctuation">;</span><span class="token punctuation">;</span>
    <span class="token keyword">esac</span>
<span class="token keyword">done</span>

<span class="token comment"># Fail fast if a required option was omitted; with set -u an unset variable</span>
<span class="token comment"># would otherwise cause a confusing error at first use</span>
if [[ -z "${dirpath_output:-}" || -z "${loop_time:-}" || ${#GPUS[@]} -eq 0 ]]; then
    echo "$HELPTEXT"
    exit 1
fi

<span class="token builtin class-name">echo</span> <span class="token string">"Querying GPUs: <span class="token variable">${GPUS<span class="token punctuation">[</span>@<span class="token punctuation">]</span>}</span>"</span>

<span class="token keyword">while</span> <span class="token boolean">true</span>
<span class="token keyword">do</span>
    <span class="token keyword">for</span> <span class="token for-or-select variable">gpu_id</span> <span class="token keyword">in</span> <span class="token variable">${GPUS<span class="token punctuation">[</span>@<span class="token punctuation">]</span>}</span>
    <span class="token keyword">do</span>
        <span class="token assign-left variable">gpu_output_path</span><span class="token operator">=</span><span class="token variable">${dirpath_output}</span>/gpu_<span class="token variable">${gpu_id}</span>

        <span class="token keyword">if</span> <span class="token operator">!</span> <span class="token assign-left variable">temp_degrees_c</span><span class="token operator">=</span><span class="token variable"><span class="token variable">$(</span>nvidia-smi --query-gpu<span class="token operator">=</span>temperature.gpu --format<span class="token operator">=</span>csv,noheader --id<span class="token operator">=</span>$gpu_id<span class="token variable">)</span></span><span class="token punctuation">;</span> <span class="token keyword">then</span>
            <span class="token builtin class-name">echo</span> <span class="token string">"Failed to fetch GPU <span class="token variable">${gpu_id}</span>"</span>
        <span class="token keyword">else</span>
            <span class="token assign-left variable">temp_millidegrees_c</span><span class="token operator">=</span><span class="token variable"><span class="token variable">$((</span>$temp_degrees_c <span class="token operator">*</span> <span class="token number">1000</span><span class="token variable">))</span></span>
            <span class="token builtin class-name">echo</span> <span class="token string">"<span class="token variable"><span class="token variable">$(</span><span class="token function">date</span> -Iseconds<span class="token variable">)</span></span> GPU <span class="token variable">${gpu_id}</span> has temperature <span class="token variable">${temp_degrees_c}</span>"</span>

            <span class="token builtin class-name">echo</span> <span class="token variable">$temp_millidegrees_c</span> <span class="token operator">></span> <span class="token variable">$gpu_output_path</span>
        <span class="token keyword">fi</span>
    <span class="token keyword">done</span>

    <span class="token builtin class-name">echo</span> <span class="token string">"<span class="token variable"><span class="token variable">$(</span><span class="token function">date</span> -Iseconds<span class="token variable">)</span></span> Sleeping <span class="token variable">${loop_time}</span>"</span>
    <span class="token function">sleep</span> <span class="token variable">$loop_time</span>
<span class="token keyword">done</span>
</code></pre></div><figcaption><p><code spellcheck="false" style="white-space: pre-wrap;"><span>export-gpu-temp</span></code><span style="white-space: pre-wrap;">, a Bash script to write one or multiple GPU temperatures to individual files, to mimic a hwmon sensor</span></p></figcaption></figure><p>Note that <code class="language-text">fancontrol</code> expects temperatures to be provided in millidegrees Celsius, following the <a href="https://www.kernel.org/doc/Documentation/hwmon/sysfs-interface?ref=ghost.justin.palpant.us">hwmon interface</a>, so the output from <code class="language-text">nvidia-smi</code> needed to be multiplied by 1000.</p><figure class="kg-card kg-code-card"><div class="kg-card kg-code-card gatsby-highlight" data-language="systemd"><pre class="language-systemd"><code class="language-systemd"><span class="token section"><span class="token punctuation">[</span><span class="token section-name selector">Unit</span><span class="token punctuation">]</span></span>
<span class="token key attr-name">Description</span><span class="token punctuation">=</span><span class="token value attr-value">Export GPU temperatures to a file continuously</span>
<span class="token key attr-name">Documentation</span><span class="token punctuation">=</span>

<span class="token section"><span class="token punctuation">[</span><span class="token section-name selector">Service</span><span class="token punctuation">]</span></span>
<span class="token key attr-name">Type</span><span class="token punctuation">=</span><span class="token value attr-value">simple</span>
<span class="token key attr-name">ExecStart</span><span class="token punctuation">=</span><span class="token value attr-value">/usr/local/bin/export-gpu-temp --gpu 0 --output /var/opt/fancontrol/ --loop 1</span>
<span class="token key attr-name">Restart</span><span class="token punctuation">=</span><span class="token value attr-value">on-failure</span>

<span class="token section"><span class="token punctuation">[</span><span class="token section-name selector">Install</span><span class="token punctuation">]</span></span>
<span class="token key attr-name">WantedBy</span><span class="token punctuation">=</span><span class="token value attr-value">multi-user.target</span>
</code></pre></div><figcaption><p><span style="white-space: pre-wrap;">A systemd Unit to export temperatures from GPU 0 to /var/opt/fancontrol/gpu_0 every second.</span></p></figcaption></figure><p>With that systemd unit up and running, it was a simple matter to modify <code class="language-text">/etc/fancontrol</code> manually to point to the correct "hardware sensor" and establish temperature bounds for the two case fans. I chose to have the case fans shut off when the GPU temperature was below 60°C, and to reach max speed at 80°C. Here <code class="language-text">hwmon3/pwm2</code> and <code class="language-text">hwmon3/pwm3</code> are the two case fans. <code class="language-text">hwmon3/pwm1</code> is the CPU fan, and is tied to <code class="language-text">hwmon2/temp2_input</code>, which is the temperature of the first CPU core.</p><figure class="kg-card kg-code-card"><div class="kg-card kg-code-card gatsby-highlight" data-language="bash"><pre class="language-bash"><code class="language-bash"><span class="token assign-left variable">INTERVAL</span><span class="token operator">=</span><span class="token number">1</span>
<span class="token assign-left variable">DEVPATH</span><span class="token operator">=</span>hwmon2<span class="token operator">=</span>devices/platform/coretemp.0 <span class="token assign-left variable">hwmon3</span><span class="token operator">=</span>devices/platform/it87.2624
<span class="token assign-left variable">DEVNAME</span><span class="token operator">=</span>hwmon2<span class="token operator">=</span>coretemp <span class="token assign-left variable">hwmon3</span><span class="token operator">=</span>it8628
<span class="token assign-left variable">FCTEMPS</span><span class="token operator">=</span>hwmon3/pwm3<span class="token operator">=</span>/var/opt/fancontrol/gpu_0 hwmon3/pwm2<span class="token operator">=</span>/var/opt/fancontrol/gpu_0 hwmon3/pwm1<span class="token operator">=</span>hwmon2/temp2_input
<span class="token assign-left variable">FCFANS</span><span class="token operator">=</span>hwmon3/pwm3<span class="token operator">=</span>hwmon3/fan3_input hwmon3/pwm2<span class="token operator">=</span>hwmon3/fan2_input hwmon3/pwm1<span class="token operator">=</span>hwmon3/fan1_input
<span class="token assign-left variable">MINTEMP</span><span class="token operator">=</span><span class="token number">50</span> hwmon3/pwm3<span class="token operator">=</span><span class="token number">60</span> hwmon3/pwm2<span class="token operator">=</span><span class="token number">60</span> hwmon3/pwm1<span class="token operator">=</span><span class="token number">60</span>
<span class="token assign-left variable">MAXTEMP</span><span class="token operator">=</span><span class="token number">50</span> hwmon3/pwm3<span class="token operator">=</span><span class="token number">80</span> hwmon3/pwm2<span class="token operator">=</span><span class="token number">80</span> hwmon3/pwm1<span class="token operator">=</span><span class="token number">95</span>
<span class="token assign-left variable">MINSTART</span><span class="token operator">=</span><span class="token number">20</span> hwmon3/pwm3<span class="token operator">=</span><span class="token number">20</span> hwmon3/pwm2<span class="token operator">=</span><span class="token number">20</span> hwmon3/pwm1<span class="token operator">=</span><span class="token number">56</span>
<span class="token assign-left variable">MINSTOP</span><span class="token operator">=</span><span class="token number">0</span> hwmon3/pwm3<span class="token operator">=</span><span class="token number">0</span> hwmon3/pwm2<span class="token operator">=</span><span class="token number">0</span> hwmon3/pwm1<span class="token operator">=</span><span class="token number">16</span>
<span class="token assign-left variable">MINPWM</span><span class="token operator">=</span><span class="token number">0</span> hwmon3/pwm3<span class="token operator">=</span><span class="token number">0</span> hwmon3/pwm2<span class="token operator">=</span><span class="token number">0</span> hwmon3/pwm1<span class="token operator">=</span><span class="token number">16</span>
<span class="token assign-left variable">MAXPWM</span><span class="token operator">=</span><span class="token number">230</span> hwmon3/pwm3<span class="token operator">=</span><span class="token number">250</span> hwmon3/pwm2<span class="token operator">=</span><span class="token number">250</span> hwmon3/pwm1<span class="token operator">=</span><span class="token number">250</span>
<span class="token assign-left variable">AVERAGE</span><span class="token operator">=</span><span class="token number">5</span></code></pre></div><figcaption><p><span style="white-space: pre-wrap;">Final </span><code spellcheck="false" style="white-space: pre-wrap;"><span>/etc/fancontrol</span></code><span style="white-space: pre-wrap;">. Read more about the available options on the </span><a href="https://linux.die.net/man/8/fancontrol?ref=ghost.justin.palpant.us"><span style="white-space: pre-wrap;">fancontrol man page</span></a></p></figcaption></figure><p>With the <code class="language-text">it87</code> kernel module, <code class="language-text">fancontrol</code>, and this script, I believed I was in a good place: sensible, GPU-aware fan control should resolve the throttling. GPU temperatures were noticeably lower under load, so it was time to check <code class="language-text">-d PERFORMANCE</code> again.</p><h3 id="sw-power-throttle">SW Power Throttle</h3><div class="kg-card kg-code-card gatsby-highlight" data-language="bash"><pre class="language-bash"><code class="language-bash">nvidia-smi -q -d PERFORMANCE

<span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span>NVSMI <span class="token assign-left variable">LOG</span><span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span>

Driver Version                      <span class="token builtin class-name">:</span> <span class="token number">440.66</span>.08
CUDA Version                        <span class="token builtin class-name">:</span> <span class="token number">10.2</span>

Attached GPUs                       <span class="token builtin class-name">:</span> <span class="token number">1</span>
GPU 00000000:01:00.0
    Performance State               <span class="token builtin class-name">:</span> P2
    Clocks Throttle Reasons
        Idle                        <span class="token builtin class-name">:</span> Not Active
        Applications Clocks Setting <span class="token builtin class-name">:</span> Not Active
        SW Power Cap                <span class="token builtin class-name">:</span> Active
        HW Slowdown                 <span class="token builtin class-name">:</span> Not Active
            HW Thermal Slowdown     <span class="token builtin class-name">:</span> Not Active
            HW Power Brake Slowdown <span class="token builtin class-name">:</span> Not Active
        Sync Boost                  <span class="token builtin class-name">:</span> Not Active
        SW Thermal Slowdown         <span class="token builtin class-name">:</span> Not Active
        Display Clock Setting       <span class="token builtin class-name">:</span> Not Active</code></pre></div><p>After all that work to fix the cooling problem, one new problem had developed: this GPU has a <a href="https://www.techpowerup.com/gpu-specs/geforce-rtx-2080-super.c3439?ref=ghost.justin.palpant.us">TDP of 250W</a>. At full throttle and when properly cooled, that wasn't enough power. Fortunately, power limit controls are available in <code class="language-text">nvidia-smi</code>. We can check what power range is appropriate for the GPU with <code class="language-text">nvidia-smi -q -d POWER</code>:</p><div class="kg-card kg-code-card gatsby-highlight" data-language="bash"><pre class="language-bash"><code class="language-bash">nvidia-smi -q -d POWER
<span class="token punctuation">..</span>.
        Power Limit                 <span class="token builtin class-name">:</span> <span class="token number">250.00</span> W
        Default Power Limit         <span class="token builtin class-name">:</span> <span class="token number">250.00</span> W
        Enforced Power Limit        <span class="token builtin class-name">:</span> <span class="token number">250.00</span> W
        Min Power Limit             <span class="token builtin class-name">:</span> <span class="token number">125.00</span> W
        Max Power Limit             <span class="token builtin class-name">:</span> <span class="token number">292.00</span> W</code></pre></div><p>This shows that even though the reference power limit is 250W, it can be configured as high as 292W and as low as 125W. </p><p>To change the power limit, run <code class="language-text">nvidia-smi -pl $PL_IN_WATTS</code> as a superuser. Note that you may first need to enable persistence mode on the GPU with <code class="language-text">nvidia-smi -pm 1</code>. <a href="https://bitcointalk.org/index.php?topic=2848723.0&#x26;ref=ghost.justin.palpant.us">This great blog post</a> has more details, and also includes a quick introduction to overclocking an NVIDIA GPU on Linux, for the interested.</p><figure class="kg-card kg-code-card"><div class="kg-card kg-code-card gatsby-highlight" data-language="bash"><pre class="language-bash"><code class="language-bash"><span class="token function">sudo</span> nvidia-smi -pl <span class="token number">292</span>
nvidia-smi -q -d POWER
<span class="token punctuation">..</span>.
        Power Limit                 <span class="token builtin class-name">:</span> <span class="token number">292.00</span> W
        Default Power Limit         <span class="token builtin class-name">:</span> <span class="token number">250.00</span> W
        Enforced Power Limit        <span class="token builtin class-name">:</span> <span class="token number">292.00</span> W
        Min Power Limit             <span class="token builtin class-name">:</span> <span class="token number">125.00</span> W
        Max Power Limit             <span class="token builtin class-name">:</span> <span class="token number">292.00</span> W
</code></pre></div><figcaption><p><span style="white-space: pre-wrap;">Modify NVIDIA GPU power limits on Linux with </span><code spellcheck="false" style="white-space: pre-wrap;"><span>nvidia-smi -pl</span></code></p></figcaption></figure><h2 id="results">Results</h2><p>With the maximum power increased, fans installed and properly controlled, the GPU now runs at a comfortable 72-75°C, and the SM clock frequency remained <a href="https://gitlab.palpant.us/grafana/dashboard/snapshot/nIUkAAOLArzoOAPzQuZcYXFwI6FraQs2?ref=ghost.justin.palpant.us">stable at 1890MHz</a> for long intervals.</p>
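<p>One caveat: power limits set with <code class="language-text">nvidia-smi</code> do not survive a reboot. A minimal sketch of a boot-time fix, assuming a systemd system and the 292W value from above (the unit name and the <code class="language-text">nvidia-smi</code> path are my assumptions; adjust for your setup):</p>

```shell
# Sketch only: nvidia-smi settings (persistence mode, power limit) reset at
# boot, so reapply them from a oneshot systemd unit. Unit name, binary path,
# and the 292 W value are assumptions mirroring the commands above.
unit="[Unit]
Description=Raise NVIDIA GPU power limit at boot

[Service]
Type=oneshot
ExecStart=/usr/bin/nvidia-smi -pm 1
ExecStart=/usr/bin/nvidia-smi -pl 292

[Install]
WantedBy=multi-user.target"

printf '%s\n' "$unit"
# To install (as root):
#   printf '%s\n' "$unit" > /etc/systemd/system/nvidia-power-limit.service
#   systemctl daemon-reload
#   systemctl enable --now nvidia-power-limit.service
```

<p>A oneshot unit may list multiple <code class="language-text">ExecStart</code> lines, which run in order, so persistence mode is enabled before the limit is raised.</p>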
<!--kg-card-begin: html-->
<iframe title="GPU frequency graph without throttling" src="https://grafana.palpant.us/dashboard-solo/snapshot/nIUkAAOLArzoOAPzQuZcYXFwI6FraQs2?orgId=1&#x26;from=1587398962225&#x26;to=1587438495381&#x26;panelId=6" style="width:100%" height="400" frameborder="0"/>
<!--kg-card-end: html-->
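<p>The throttle-reason check can also be scripted rather than eyeballed from the full <code class="language-text">-q</code> report. A sketch using <code class="language-text">nvidia-smi</code>'s CSV query interface (the field names are listed by <code class="language-text">nvidia-smi --help-query-gpu</code>; the helper name is mine):</p>

```shell
# Sketch: decide whether any throttle reason is active from one CSV line of
# nvidia-smi query output, e.g. "Not Active, Not Active".
not_throttled() {
    # A bare "Active" at the start of a field (not preceded by "Not ")
    # means that throttle reason is currently firing.
    ! printf '%s\n' "$1" | grep -qE '(^|, )Active'
}

# Live usage would query the GPU directly, e.g.:
#   reasons=$(nvidia-smi --query-gpu=clocks_throttle_reasons.sw_power_cap,clocks_throttle_reasons.sw_thermal_slowdown --format=csv,noheader)
# Demo with a captured line instead of live hardware:
not_throttled "Not Active, Not Active" && echo "no throttling"
```

<p>With the full report, the same conclusion is visible at a glance:</p>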
<p><code class="language-text">nvidia-smi</code> no longer indicates any form of throttling is occurring:</p><div class="kg-card kg-code-card gatsby-highlight" data-language="bash"><pre class="language-bash"><code class="language-bash">nvidia-smi -q -d PERFORMANCE

<span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span>NVSMI <span class="token assign-left variable">LOG</span><span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span><span class="token operator">==</span>

Timestamp                           <span class="token builtin class-name">:</span> Sat Apr <span class="token number">25</span> <span class="token number">13</span>:04:02 <span class="token number">2020</span>
Driver Version                      <span class="token builtin class-name">:</span> <span class="token number">440.66</span>.08
CUDA Version                        <span class="token builtin class-name">:</span> <span class="token number">10.2</span>

Attached GPUs                       <span class="token builtin class-name">:</span> <span class="token number">1</span>
GPU 00000000:01:00.0
    Performance State               <span class="token builtin class-name">:</span> P0
    Clocks Throttle Reasons
        Idle                        <span class="token builtin class-name">:</span> Not Active
        Applications Clocks Setting <span class="token builtin class-name">:</span> Not Active
        SW Power Cap                <span class="token builtin class-name">:</span> Not Active
        HW Slowdown                 <span class="token builtin class-name">:</span> Not Active
            HW Thermal Slowdown     <span class="token builtin class-name">:</span> Not Active
            HW Power Brake Slowdown <span class="token builtin class-name">:</span> Not Active
        Sync Boost                  <span class="token builtin class-name">:</span> Not Active
        SW Thermal Slowdown         <span class="token builtin class-name">:</span> Not Active
        Display Clock Setting       <span class="token builtin class-name">:</span> Not Active</code></pre></div><p>But the real test is in the benchmarks. While I somewhat unreliably observed higher <a href="https://foldingathome.org/iamoneinamillion/?ref=ghost.justin.palpant.us">Folding@Home</a> Points Per Day, I turned to a more rigorous benchmark with the Phoronix Test Suite:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://ghost.justin.palpant.us/content/images/2020/04/Screenshot-2020-04-21-18.11.38.png" class="kg-image" alt="Maximizing NVIDIA GPU performance on Linux" loading="lazy" width="1980" height="756" srcset="https://ghost.justin.palpant.us/content/images/size/w600/2020/04/Screenshot-2020-04-21-18.11.38.png 600w, https://ghost.justin.palpant.us/content/images/size/w1000/2020/04/Screenshot-2020-04-21-18.11.38.png 1000w, https://ghost.justin.palpant.us/content/images/size/w1600/2020/04/Screenshot-2020-04-21-18.11.38.png 1600w, https://ghost.justin.palpant.us/content/images/2020/04/Screenshot-2020-04-21-18.11.38.png 1980w" sizes="(min-width: 720px) 720px"/><figcaption><span style="white-space: pre-wrap;">Benchmarking with Phoronix Test Suite's </span><a href="https://openbenchmarking.org/test/pts/unigine-heaven-1.6.4?ref=ghost.justin.palpant.us"><span style="white-space: pre-wrap;">pts/unigine-heaven</span></a><span style="white-space: pre-wrap;"> benchmark. Full result </span><a href="https://openbenchmarking.org/result/2003228-JUST-191230635?ref=ghost.justin.palpant.us#r-3ae55c63f1481e2e8194b34a4a304a3d8ad11d0a"><span style="white-space: pre-wrap;">here</span></a><span style="white-space: pre-wrap;">, with my old GTX 1050Ti for reference.</span></figcaption></figure><p>An increase of 25FPS, or ~15%, is nothing to sneeze at! 
It's not huge, but it's approximately the difference between adjacent grades of graphics cards these days, so this felt like getting a free upgrade.</p><h2 id="summary">Summary</h2><p>Check for GPU throttling with <code class="language-text">nvidia-smi -q -d PERFORMANCE --loop-ms=500</code>. If thermal throttling occurs, consider improving cooling with better fans, additional case fans, or, failing that, a liquid-cooling system. If no thermal throttling is happening, don't waste time or money on a complex cooling setup! If you encounter hardware power throttling, you may need to buy a more powerful power supply. If software-defined power throttling is happening, try to change the software-defined power limits by checking the acceptable power range with <code class="language-text">nvidia-smi -q -d POWER</code> and setting the active limits with <code class="language-text">nvidia-smi -pl</code>.</p><p>As an added benefit, I find that it's easy to use the software-defined power limit as a cheap GPU throttle: reducing the power limit to 150W makes the GPU run cool, at the cost of about half the performance.</p><p><em>*The Performance State indicator is also interesting, and you can read more about it in </em><a href="https://docs.nvidia.com/gameworks/content/gameworkslibrary/coresdk/nvapi/group__gpupstate.html?ref=ghost.justin.palpant.us"><em>the NVIDIA docs</em></a><em>. According to </em><a href="https://www.reddit.com/r/RenderToken/comments/9w2rd9/how_to_use_maximum_p0_power_state_with_nvidia?ref=ghost.justin.palpant.us"><em>this Reddit post</em></a><em>, P0-P2 power states have identical core clock frequencies, but P2 reduces the memory clock frequency. It also states that all compute other than live graphical rendering will keep the card in the P2 state. 
Since my memory utilization is low, this isn't a problem, but if memory bandwidth or utilization is a concern, check whether the card is being held in a reduced power state.</em></p><p><em>**I have since stably overclocked the SM frequency by +100MHz, and now see constant frequencies at 1995MHz without thermal, power, or other stability issues. The benchmark and plots above, however, show the state of the system and the performance gains without any overclocking.</em></p>]]></content:encoded></item><item><title><![CDATA[Understanding btrfs on Ubuntu - An introduction to btrfs]]></title><description><![CDATA[What I've learned trying btrfs, the next-generation filesystem, on Ubuntu.]]></description><link>https://justin.palpant.us/btrfs-on-ubuntu-part-1/</link><guid isPermaLink="false">Ghost__Post__604dbf89a33ad9000707c3e2</guid><category><![CDATA[btrfs]]></category><category><![CDATA[linux]]></category><category><![CDATA[tech]]></category><dc:creator><![CDATA[Justin Palpant]]></dc:creator><pubDate>Mon, 12 Apr 2021 16:00:00 GMT</pubDate><media:content url="https://justin.palpant.us/static/eeb5700129d2a72695f9db10c91dec66/B-tree.svg" medium="image"/><content:encoded><![CDATA[<img src="https://justin.palpant.us/static/eeb5700129d2a72695f9db10c91dec66/B-tree.svg" alt="Understanding btrfs on Ubuntu - An introduction to btrfs"/><p>With a recent reinstall of Ubuntu 20.04 on my personal computer, I decided to explore a new type of filesystem and find a replacement for how I used the traditional ext4 and LVM - and I settled on <a href="https://en.wikipedia.org/wiki/Btrfs?ref=ghost.justin.palpant.us">btrfs</a>. While I was and am excited about the power of this next-generation filesystem, I've discovered that it doesn't run itself - with powerful features comes complexity.</p><h1 id="background">Background</h1><p>The computer I am testing on has four disks: one 1TB SSD, one 2TB SSD, and two additional 512GB NVMe SSDs. 
The 1TB and 2TB disks are managed by a hardware RAID controller.</p><p>It's a personal computer for gaming and daily use, as well as a place to experiment. It runs a small single-node Kubernetes cluster which I use for automated builds of some of my repos, for running <a href="https://foldingathome.org/?ref=ghost.justin.palpant.us">Folding@Home</a>, and several other services. You can learn more about this machine and my other infrastructure from the <a href="https://gitlab.palpant.us/justin/palpantlab-infra/-/blob/master/README.md?ref=ghost.justin.palpant.us#palpantlab-sfo">README</a> for my homelab's repository.</p><p>During the reinstall, I also chose to dual-boot Windows and Linux, splitting the main disk in two: 200GB for the Ubuntu root (<code class="language-text">/</code>) and 800GB for Windows. The two 512GB drives don't have a hardware RAID controller, but I wanted to use those in RAID1 for the <code class="language-text">/home</code> folder in Ubuntu.</p><figure class="kg-card kg-code-card"><div class="kg-card kg-code-card gatsby-highlight" data-language="bash"><pre class="language-bash"><code class="language-bash">$ <span class="token function">sudo</span> lsblk
sda           <span class="token number">8</span>:0    <span class="token number">0</span>   <span class="token number">1</span>.8T  <span class="token number">0</span> disk  
└─md126       <span class="token number">9</span>:126  <span class="token number">0</span> <span class="token number">931</span>.5G  <span class="token number">0</span> raid1 
  ├─md126p1 <span class="token number">259</span>:2    <span class="token number">0</span>   100M  <span class="token number">0</span> part  /boot/efi
  ├─md126p2 <span class="token number">259</span>:3    <span class="token number">0</span>    16M  <span class="token number">0</span> part  
  ├─md126p3 <span class="token number">259</span>:4    <span class="token number">0</span> <span class="token number">735</span>.6G  <span class="token number">0</span> part  <span class="token operator">&#x3C;</span>-- Windows 
  ├─md126p4 <span class="token number">259</span>:5    <span class="token number">0</span>   499M  <span class="token number">0</span> part  
  └─md126p5 <span class="token number">259</span>:6    <span class="token number">0</span> <span class="token number">195</span>.3G  <span class="token number">0</span> part  /
sdb           <span class="token number">8</span>:16   <span class="token number">0</span> <span class="token number">931</span>.5G  <span class="token number">0</span> disk  
└─md126       <span class="token number">9</span>:126  <span class="token number">0</span> <span class="token number">931</span>.5G  <span class="token number">0</span> raid1 
  ├─md126p1 <span class="token number">259</span>:2    <span class="token number">0</span>   100M  <span class="token number">0</span> part  /boot/efi
  ├─md126p2 <span class="token number">259</span>:3    <span class="token number">0</span>    16M  <span class="token number">0</span> part  
  ├─md126p3 <span class="token number">259</span>:4    <span class="token number">0</span> <span class="token number">735</span>.6G  <span class="token number">0</span> part  <span class="token operator">&#x3C;</span>-- Windows  
  ├─md126p4 <span class="token number">259</span>:5    <span class="token number">0</span>   499M  <span class="token number">0</span> part  
  └─md126p5 <span class="token number">259</span>:6    <span class="token number">0</span> <span class="token number">195</span>.3G  <span class="token number">0</span> part  /
nvme0n1     <span class="token number">259</span>:0    <span class="token number">0</span> <span class="token number">465</span>.8G  <span class="token number">0</span> disk  /home
nvme1n1     <span class="token number">259</span>:1    <span class="token number">0</span> <span class="token number">465</span>.8G  <span class="token number">0</span> disk  </code></pre></div><figcaption>lsblk output showing RAID disks and additional NVMe disks</figcaption></figure><p>I have also spent significant time in the past, out of curiosity, testing different systems of backup and restore for this machine, including Ubuntu's built-in <a href="https://wiki.gnome.org/Apps/DejaDup?ref=ghost.justin.palpant.us">Deja Dup</a>, <a href="https://duplicacy.com/?ref=ghost.justin.palpant.us">duplicacy</a>, and full-disk backups with <code class="language-text">dd</code> and <code class="language-text">tar</code>, and looked forward to some of the features btrfs provides to make backups easier.</p><p>So, with these goals in mind, what are some of the features btrfs provides to make this happen?</p><h1 id="btrfs-a-next-generation-filesystem">btrfs - a next-generation filesystem</h1><p>Development of btrfs began in 2007; it was merged into the mainline Linux kernel in 2009 and declared stable in 2013. The name has <a href="https://en.wikipedia.org/wiki/Btrfs?ref=ghost.justin.palpant.us">various pronunciations</a>, and is a reference to the filesystem's core data structure: a <a href="https://en.wikipedia.org/wiki/Copy-on-write?ref=ghost.justin.palpant.us">copy-on-write</a> (COW) B-tree.</p><p>Built into btrfs are a number of features you may not expect the filesystem to provide for you. Commonly-desired features that Linux users would install additional software for have been built into the filesystem from the beginning.</p><h3 id="device-management">Device management</h3><p>btrfs gives users some ways to manage physical devices directly, like LVM, and provides some support for software-RAID arrangements, like mdadm. 
Though not meant to compete feature-for-feature, btrfs supports:</p><ul><li>Creating a filesystem with metadata, data, or both in RAID 0, RAID 1, RAID 10, RAID 5 and RAID 6</li><li>Performing day-2 addition of disks to an existing filesystem, and conversion between RAID levels</li><li>Resizing the filesystem to take advantage of added disks, or decreasing the size of the filesystem to account for lost disks</li><li>Balancing data across disks, including when replacing disks due to failure</li></ul><p>btrfs doesn't provide complex volume group and logical device management like LVM - instead, all devices are joined into a common pool, and individual pieces of data are arranged onto the storage pool according to RAID configuration.</p><p>However, though devices are pooled within each btrfs filesystem, it is possible to have multiple <em>filesystems</em> active, and the command <code class="language-text">btrfs device scan</code> searches all devices for distinct btrfs filesystems.</p><p>The Ubuntu installer creates one btrfs filesystem for the root directory. I placed this on a 200GB partition of the hardware-RAID-controlled pair of disks. I moved the home directory to a second filesystem using btrfs' RAID:</p><figure class="kg-card kg-code-card"><div class="kg-card kg-code-card gatsby-highlight" data-language="bash"><pre class="language-bash"><code class="language-bash"><span class="token function">sudo</span> mkfs.btrfs -m raid1 -d raid1 /dev/nvme0n1 /dev/nvme1n1
<span class="token function">sudo</span> <span class="token function">mount</span> /dev/nvme0n1 /home</code></pre></div><figcaption>Creating a second btrfs filesystem with software RAID1 for /home</figcaption></figure><div class="kg-card kg-code-card gatsby-highlight" data-language="bash"><pre class="language-bash"><code class="language-bash">$ <span class="token function">sudo</span> btrfs filesystem show
Label: none  uuid: 3451815e-07c2-4b60-bd43-68fd338aa881
        Total devices <span class="token number">1</span> FS bytes used <span class="token number">172</span>.82GiB
        devid    <span class="token number">1</span> size <span class="token number">195</span>.31GiB used <span class="token number">177</span>.03GiB path /dev/md126p5

Label: none  uuid: af5e3ee6-40c6-4dc0-82f3-5f6a025f842c
        Total devices <span class="token number">2</span> FS bytes used <span class="token number">49</span>.55GiB
        devid    <span class="token number">1</span> size <span class="token number">465</span>.76GiB used <span class="token number">83</span>.03GiB path /dev/nvme0n1
        devid    <span class="token number">2</span> size <span class="token number">465</span>.76GiB used <span class="token number">83</span>.03GiB path /dev/nvme1n1</code></pre></div><h3 id="subvolumes">Subvolumes</h3><p>btrfs allows users to create multiple subvolumes within a filesystem. btrfs subvolumes resemble folders within the filesystem, and can be nested within each other. The mounted filesystem within which you create the subvolume is the subvolume's parent. Mounting a parent subvolume implicitly mounts the child subvolumes at their path. Each subvolume has a UUID and a numeric ID, and is also uniquely identified by its <code class="language-text">name</code>, which is also the path at which it will appear under its parent. Because this <code class="language-text">name</code> is also the subvolume's path, moving a subvolume is the same as renaming it.</p><p>However, subvolumes differ from folders in a number of ways:</p><ul><li>Subvolumes can be individually mounted at another location</li><li>Subvolumes are globally queryable with <code class="language-text">btrfs subvolume list &#x3C;path></code></li></ul><p>All btrfs filesystems have at least one subvolume: this is the root subvolume, and by convention has ID=5 and the hidden name <code class="language-text">FS_TREE</code>. This root subvolume is not included in <code class="language-text">btrfs subvolume list</code>, but in filesystems with additional subvolumes, it will be the ultimate parent of any child subvolumes. The <code class="language-text">top level</code> field of <code class="language-text">btrfs subvolume list</code> indicates the ID of the parent subvolume. 
With <code class="language-text">-a</code>, the <code class="language-text">path</code> field shows the name of the parent as the first part of the path.</p><figure class="kg-card kg-code-card"><div class="kg-card kg-code-card gatsby-highlight" data-language="bash"><pre class="language-bash"><code class="language-bash">$ <span class="token function">sudo</span> btrfs subvolume list -atq /
ID      gen     <span class="token function">top</span> level       parent_uuid     path
--      ---     ---------       -----------     ----
<span class="token number">256</span>     <span class="token number">336674</span>  <span class="token number">5</span>               -               <span class="token operator">&#x3C;</span>FS_TREE<span class="token operator">></span>/@</code></pre></div><figcaption>Ubuntu subvolumes for the root filesystem on OS install</figcaption></figure><p>Any subvolume within the filesystem can be mounted as the root, not just <code class="language-text">FS_TREE</code>. Ubuntu, in fact, creates a child subvolume with the name <code class="language-text">@</code> and mounts this subvolume to the path <code class="language-text">/</code> instead of mounting <code class="language-text">FS_TREE</code> to that location.</p><p>For more help understanding subvolumes and when to use them, check out the btrfs <a href="https://btrfs.wiki.kernel.org/index.php/SysadminGuide?ref=ghost.justin.palpant.us#Subvolumes">SysadminGuide</a>.</p><h3 id="snapshots">Snapshots</h3><p>Because btrfs is copy-on-write, it supports lightweight <a href="https://btrfs.wiki.kernel.org/index.php/SysadminGuide?ref=ghost.justin.palpant.us#Snapshots">snapshots</a>, which capture the current state and then record only the changes made to the filesystem afterward.</p><p>These snapshots are actually new subvolumes, identical to plain subvolumes except that they are populated with the content of their source at creation. This is done without consuming any space initially because the new subvolume simply references the data without making a new copy. Snapshots live within the filesystem as subvolumes, and are mountable and browsable like any other.</p><p>Read-only snapshots preserve the state of the filesystem at a fixed point in time, while read-write snapshots created from them can be used to restore that state and recover.</p><h3 id="compression">Compression</h3><p>On top of these features, btrfs also supports automatic compression using one of several algorithms: <u>zlib</u>, <u>lzo</u>, and <u>zstd</u>. 
Compression can be activated at mount time: either on files a btrfs-designed heuristic judges compressible, by specifying <code class="language-text">-o compress</code> (though some users do not recommend this approach), or on all files, by specifying <code class="language-text">-o compress-force</code>. You can also use extended attributes to enable or disable compression on individual files using the command <code class="language-text">btrfs property set &#x3C;file> compression ...</code> or <code class="language-text">chattr +c</code>.</p><hr/><blockquote>Credit for the title image to CyHawk - Own work based on [1]., CC BY-SA 3.0, <a href="https://commons.wikimedia.org/w/index.php?curid=11701365&#x26;ref=ghost.justin.palpant.us">https://commons.wikimedia.org/w/index.php?curid=11701365</a></blockquote>]]></content:encoded></item><item><title><![CDATA[Folding@Home on Kubernetes]]></title><description><![CDATA[Donate extra compute resources to investigate COVID-19 using Kubernetes.]]></description><link>https://justin.palpant.us/folding-home-on-kubernetes/</link><guid isPermaLink="false">Ghost__Post__5e707ad1434787000756fd29</guid><category><![CDATA[tech]]></category><category><![CDATA[kubernetes]]></category><category><![CDATA[prometheus]]></category><category><![CDATA[grafana]]></category><category><![CDATA[covid-19]]></category><category><![CDATA[coronavirus]]></category><dc:creator><![CDATA[Justin Palpant]]></dc:creator><pubDate>Tue, 17 Mar 2020 16:00:00 GMT</pubDate><media:content url="https://justin.palpant.us/static/c5c9715c185f4c553d124ffe4abd029d/2019-nCoV-CDC-23311-progressive.jpeg" medium="image"/><content:encoded><![CDATA[<img src="https://justin.palpant.us/static/c5c9715c185f4c553d124ffe4abd029d/2019-nCoV-CDC-23311-progressive.jpeg" alt="Folding@Home on Kubernetes"/><p><em>Update: I now publish <a href="https://quay.io/repository/jpalpant/fah-client?tab=info&#x26;ref=ghost.justin.palpant.us">jpalpant/fah-client</a> if you are looking for a 
thin, GPU-supporting wrapper around FAHClient until an official one is released, as well as <a href="https://hub.docker.com/r/jpalpant/folding-exporter?ref=ghost.justin.palpant.us">jpalpant/folding-exporter</a>, a Prometheus exporter for tracking F@H PPD.</em></p><p><a href="https://foldingathome.org/?ref=ghost.justin.palpant.us">Folding@Home</a> (F@H or FAH) is an incredible distributed computing project that lets individuals sign up to donate extra compute resources to researchers solving problems that would otherwise only be accessible to those with the most powerful of supercomputers. It uses those resources by assigning each computer small pieces of incredibly complex molecular simulations, and then assembling the results when each computer is finished*. With the power of hundreds of thousands of users, <a href="https://en.wikipedia.org/wiki/Folding@home?ref=ghost.justin.palpant.us#Performance">F@H</a> rivals <a href="https://en.wikipedia.org/wiki/TOP500?ref=ghost.justin.palpant.us#TOP_500">the fastest supercomputers in the world today</a>.</p><p>Having recently acquired a new GPU for my PC, one that sat quiet most of the day, I was happy to learn that F@H was still running strong after all these years. Since my homelab <a href="https://gitlab.palpant.us/justin/palpantlab-infra/-/blob/master/README.md?ref=ghost.justin.palpant.us#palpantlab-sfo">includes a single-node Kubernetes cluster</a> running on that PC, I decided to use it to see if I could run F@H. There are <a href="https://foldingathome.org/start-folding/?ref=ghost.justin.palpant.us">much easier ways to install F@H</a> on your system, but I was happy to have another chance to use my build and deploy system.</p><p>This was a complete coincidence (as far as I remember), but I decided to try this on March 5th, about a week after F@H started to work on simulating COVID-19:</p><!--kg-card-begin: html--><div class="row">
	<blockquote class="twitter-tweet tw-align-center" data-theme="dark"><p lang="en" dir="ltr">Help us in the fight against COVID-19! Download the app at: <a href="https://t.co/andJ4PDzVl?ref=ghost.justin.palpant.us">https://t.co/andJ4PDzVl</a> <a href="https://twitter.com/hashtag/Coronavirus?src=hash&#x26;ref_src=twsrc%5Etfw&#x26;ref=ghost.justin.palpant.us">#Coronavirus</a> <a href="https://twitter.com/hashtag/2019nCov?src=hash&#x26;ref_src=twsrc%5Etfw&#x26;ref=ghost.justin.palpant.us">#2019nCov</a> <a href="https://twitter.com/hashtag/COVID19?src=hash&#x26;ref_src=twsrc%5Etfw&#x26;ref=ghost.justin.palpant.us">#COVID19</a> <a href="https://twitter.com/hashtag/SARSCoV2?src=hash&#x26;ref_src=twsrc%5Etfw&#x26;ref=ghost.justin.palpant.us">#SARSCoV2</a> <a href="https://t.co/BSmiV8phh1?ref=ghost.justin.palpant.us">https://t.co/BSmiV8phh1</a></p>— Folding@home (@foldingathome) <a href="https://twitter.com/foldingathome/status/1233150565347016706?ref_src=twsrc%5Etfw&#x26;ref=ghost.justin.palpant.us">February 27, 2020</a></blockquote>
	<script async="" src="https://platform.twitter.com/widgets.js" charset="utf-8"/>
</div><!--kg-card-end: html--><h3 id="spinning-up">Spinning up</h3><p>The first stage for most of these small projects has 3 parts:</p><ol><li>Decide if I need to build a custom Docker image or if I can reuse an existing one</li><li>Decide what Kubernetes objects I'll need in the deployment</li><li>Bootstrap the git repo, <a href="https://gitlab.palpant.us/justin/folding?ref=ghost.justin.palpant.us">GitLab project</a>, <a href="https://gitlab.palpant.us/justin/folding/-/blob/master/.gitlab-ci.yml?ref=ghost.justin.palpant.us">build YAML</a>, and "extras" (DNS, manual configs, healthchecks).</li></ol><p>I settled on a custom Docker image after checking out a few that were available - most didn't leave me enough control over initialization, or didn't allow enough customization. </p><figure class="kg-card kg-code-card"><div class="kg-card kg-code-card gatsby-highlight" data-language="dockerfile"><pre class="language-dockerfile"><code class="language-dockerfile"><span class="token instruction"><span class="token keyword">FROM</span> nvidia/opencl:devel-ubuntu18.04</span>

<span class="token instruction"><span class="token keyword">LABEL</span> maintainer=<span class="token string">"justin@palpant.us"</span></span>

<span class="token instruction"><span class="token keyword">ARG</span> FAH_VERSION_MAJOR=7</span>
<span class="token instruction"><span class="token keyword">ARG</span> FAH_VERSION_MINOR=5</span>
<span class="token instruction"><span class="token keyword">ARG</span> FAH_VERSION_PATCH=1</span>

<span class="token instruction"><span class="token keyword">ENV</span> DEBIAN_FRONTEND=noninteractive</span>

<span class="token instruction"><span class="token keyword">RUN</span> apt-get update &#x26;&#x26; apt-get install --no-install-recommends -y <span class="token operator">\</span>
        ca-certificates wget bzip2 dumb-init &#x26;&#x26;<span class="token operator">\</span>
        wget https://download.foldingathome.org/releases/public/release/fahclient/debian-stable-64bit/v<span class="token variable">${FAH_VERSION_MAJOR}</span>.<span class="token variable">${FAH_VERSION_MINOR}</span>/fahclient_<span class="token variable">${FAH_VERSION_MAJOR}</span>.<span class="token variable">${FAH_VERSION_MINOR}</span>.<span class="token variable">${FAH_VERSION_PATCH}</span>_amd64.deb &#x26;&#x26;<span class="token operator">\</span>
        mkdir -p /etc/fahclient/ &#x26;&#x26;<span class="token operator">\</span>
        touch /etc/fahclient/config.xml &#x26;&#x26;<span class="token operator">\</span>
        dpkg --install *.deb &#x26;&#x26;<span class="token operator">\</span>
        apt-get autoremove -y &#x26;&#x26;<span class="token operator">\</span>
        rm --recursive --verbose --force /tmp/* /var/log/* /var/lib/apt/ &#x26;&#x26;<span class="token operator">\</span>
        mkdir /var/opt/folding</span>

<span class="token instruction"><span class="token keyword">WORKDIR</span> /var/opt/folding</span>

<span class="token instruction"><span class="token keyword">COPY</span> init.sh /init.sh</span>

<span class="token instruction"><span class="token keyword">ENTRYPOINT</span> [ <span class="token string">"/init.sh"</span> ]</span></code></pre></div><figcaption>Customized Dockerfile for FAH, inspired by <a href="https://hub.docker.com/r/johnktims/folding-at-home/?ref=ghost.justin.palpant.us">johnktims/folding-at-home</a></figcaption></figure><p>Because Folding@Home's client comes with a built-in web UI for manual configuration, I wanted to serve that UI over an authenticated, public web page. I typically use <a href="https://github.com/pusher/oauth2_proxy?ref=ghost.justin.palpant.us">pusher/oauth2_proxy</a> for this because it supports Sign-in with Google easily, and decided to use it here as well. I also wanted configuration from the web UI as well as work-in-progress to persist across container restarts. In a cloud environment this would mean using a PersistentVolume, but on this one-node cluster, I just use HostPath mounts. Likewise, the one-node cluster uses a lot of NodePort Services, instead of the Ingress resources that I like to use in the cloud.</p><figure class="kg-card kg-code-card"><div class="kg-card kg-code-card gatsby-highlight" data-language="yaml"><pre class="language-yaml"><code class="language-yaml"><span class="token key atrule">apiVersion</span><span class="token punctuation">:</span> apps/v1
<span class="token key atrule">kind</span><span class="token punctuation">:</span> StatefulSet
<span class="token key atrule">metadata</span><span class="token punctuation">:</span>
  <span class="token key atrule">name</span><span class="token punctuation">:</span> folding<span class="token punctuation">-</span>at<span class="token punctuation">-</span>home
  <span class="token key atrule">namespace</span><span class="token punctuation">:</span> prod
<span class="token key atrule">spec</span><span class="token punctuation">:</span>
  <span class="token key atrule">serviceName</span><span class="token punctuation">:</span> folding<span class="token punctuation">-</span>at<span class="token punctuation">-</span>home
  <span class="token key atrule">selector</span><span class="token punctuation">:</span>
    <span class="token key atrule">matchLabels</span><span class="token punctuation">:</span>
      <span class="token key atrule">app</span><span class="token punctuation">:</span> folding<span class="token punctuation">-</span>at<span class="token punctuation">-</span>home
  <span class="token key atrule">replicas</span><span class="token punctuation">:</span> <span class="token number">1</span>
  <span class="token key atrule">template</span><span class="token punctuation">:</span>
    <span class="token key atrule">metadata</span><span class="token punctuation">:</span>
      <span class="token key atrule">labels</span><span class="token punctuation">:</span>
        <span class="token key atrule">app</span><span class="token punctuation">:</span> folding<span class="token punctuation">-</span>at<span class="token punctuation">-</span>home
    <span class="token key atrule">spec</span><span class="token punctuation">:</span>
      <span class="token key atrule">imagePullSecrets</span><span class="token punctuation">:</span>
        <span class="token punctuation">-</span> <span class="token key atrule">name</span><span class="token punctuation">:</span> k8s<span class="token punctuation">-</span>gcr<span class="token punctuation">-</span>read<span class="token punctuation">-</span>only
      <span class="token key atrule">containers</span><span class="token punctuation">:</span>
      <span class="token punctuation">-</span> <span class="token key atrule">args</span><span class="token punctuation">:</span>
        <span class="token punctuation">-</span> <span class="token punctuation">-</span>provider=google
        <span class="token punctuation">-</span> <span class="token punctuation">-</span>google<span class="token punctuation">-</span>admin<span class="token punctuation">-</span>email=lab@palpant.us
        <span class="token punctuation">-</span> <span class="token punctuation">-</span>google<span class="token punctuation">-</span>group=folding<span class="token punctuation">-</span>access@palpant.us
        <span class="token punctuation">-</span> <span class="token punctuation">-</span>email<span class="token punctuation">-</span>domain=*
        <span class="token punctuation">-</span> <span class="token punctuation">-</span>google<span class="token punctuation">-</span>service<span class="token punctuation">-</span>account<span class="token punctuation">-</span>json=/sa/palpantlab<span class="token punctuation">-</span>main<span class="token punctuation">-</span>40b0f9caae61.json
        <span class="token punctuation">-</span> <span class="token punctuation">-</span>upstream=http<span class="token punctuation">:</span>//localhost<span class="token punctuation">:</span><span class="token number">7396</span>
        <span class="token punctuation">-</span> <span class="token punctuation">-</span>http<span class="token punctuation">-</span>address=0.0.0.0<span class="token punctuation">:</span><span class="token number">4180</span>
        <span class="token punctuation">-</span> <span class="token punctuation">-</span>redirect<span class="token punctuation">-</span>url=https<span class="token punctuation">:</span>//folding.palpant.us/oauth2/callback
        <span class="token key atrule">env</span><span class="token punctuation">:</span>
        <span class="token punctuation">-</span> <span class="token key atrule">name</span><span class="token punctuation">:</span> OAUTH2_PROXY_CLIENT_ID
          <span class="token key atrule">valueFrom</span><span class="token punctuation">:</span>
            <span class="token key atrule">secretKeyRef</span><span class="token punctuation">:</span>
              <span class="token key atrule">name</span><span class="token punctuation">:</span> transmission<span class="token punctuation">-</span>oauth2<span class="token punctuation">-</span>proxy<span class="token punctuation">-</span>account
              <span class="token key atrule">key</span><span class="token punctuation">:</span> client_id
        <span class="token punctuation">-</span> <span class="token key atrule">name</span><span class="token punctuation">:</span> OAUTH2_PROXY_CLIENT_SECRET
          <span class="token key atrule">valueFrom</span><span class="token punctuation">:</span>
            <span class="token key atrule">secretKeyRef</span><span class="token punctuation">:</span>
              <span class="token key atrule">name</span><span class="token punctuation">:</span> transmission<span class="token punctuation">-</span>oauth2<span class="token punctuation">-</span>proxy<span class="token punctuation">-</span>account
              <span class="token key atrule">key</span><span class="token punctuation">:</span> client_secret
        <span class="token punctuation">-</span> <span class="token key atrule">name</span><span class="token punctuation">:</span> OAUTH2_PROXY_COOKIE_SECRET
          <span class="token key atrule">valueFrom</span><span class="token punctuation">:</span>
            <span class="token key atrule">secretKeyRef</span><span class="token punctuation">:</span>
              <span class="token key atrule">name</span><span class="token punctuation">:</span> transmission<span class="token punctuation">-</span>oauth2<span class="token punctuation">-</span>proxy<span class="token punctuation">-</span>account
              <span class="token key atrule">key</span><span class="token punctuation">:</span> cookie_secret
        <span class="token key atrule">volumeMounts</span><span class="token punctuation">:</span>
        <span class="token punctuation">-</span> <span class="token key atrule">name</span><span class="token punctuation">:</span> transmission<span class="token punctuation">-</span>ui<span class="token punctuation">-</span>sa
          <span class="token key atrule">mountPath</span><span class="token punctuation">:</span> /sa
        <span class="token key atrule">image</span><span class="token punctuation">:</span> quay.io/pusher/oauth2_proxy<span class="token punctuation">:</span>v5.0.0
        <span class="token key atrule">imagePullPolicy</span><span class="token punctuation">:</span> Always
        <span class="token key atrule">livenessProbe</span><span class="token punctuation">:</span>
          <span class="token key atrule">httpGet</span><span class="token punctuation">:</span>
            <span class="token key atrule">scheme</span><span class="token punctuation">:</span> HTTP
            <span class="token key atrule">path</span><span class="token punctuation">:</span> /ping
            <span class="token key atrule">port</span><span class="token punctuation">:</span> web
          <span class="token key atrule">initialDelaySeconds</span><span class="token punctuation">:</span> <span class="token number">30</span>
          <span class="token key atrule">timeoutSeconds</span><span class="token punctuation">:</span> <span class="token number">30</span>
        <span class="token key atrule">name</span><span class="token punctuation">:</span> oauth2<span class="token punctuation">-</span>proxy
        <span class="token key atrule">ports</span><span class="token punctuation">:</span>
        <span class="token punctuation">-</span> <span class="token key atrule">containerPort</span><span class="token punctuation">:</span> <span class="token number">4180</span>
          <span class="token key atrule">name</span><span class="token punctuation">:</span> web
      <span class="token punctuation">-</span> <span class="token key atrule">image</span><span class="token punctuation">:</span> $<span class="token punctuation">{</span>CONTAINER_REGISTRY<span class="token punctuation">}</span><span class="token punctuation">:</span>$<span class="token punctuation">{</span>CI_COMMIT_SHA<span class="token punctuation">}</span>
        <span class="token key atrule">name</span><span class="token punctuation">:</span> folding
        <span class="token key atrule">imagePullPolicy</span><span class="token punctuation">:</span> Always
        <span class="token key atrule">ports</span><span class="token punctuation">:</span>
        <span class="token punctuation">-</span> <span class="token key atrule">name</span><span class="token punctuation">:</span> http
          <span class="token key atrule">containerPort</span><span class="token punctuation">:</span> <span class="token number">7396</span>
        <span class="token punctuation">-</span> <span class="token key atrule">name</span><span class="token punctuation">:</span> command
          <span class="token key atrule">containerPort</span><span class="token punctuation">:</span> <span class="token number">36330</span>
        <span class="token key atrule">livenessProbe</span><span class="token punctuation">:</span>
          <span class="token key atrule">httpGet</span><span class="token punctuation">:</span>
            <span class="token key atrule">scheme</span><span class="token punctuation">:</span> HTTP
            <span class="token key atrule">path</span><span class="token punctuation">:</span> /
            <span class="token key atrule">port</span><span class="token punctuation">:</span> http
        <span class="token key atrule">args</span><span class="token punctuation">:</span>
        <span class="token punctuation">-</span> <span class="token punctuation">-</span><span class="token punctuation">-</span>web<span class="token punctuation">-</span>allow=0/0
        <span class="token punctuation">-</span> <span class="token punctuation">-</span><span class="token punctuation">-</span>allow=0/0
        <span class="token punctuation">-</span> <span class="token punctuation">-</span><span class="token punctuation">-</span>cpu<span class="token punctuation">-</span>usage=35
        <span class="token punctuation">-</span> <span class="token punctuation">-</span><span class="token punctuation">-</span>session<span class="token punctuation">-</span>lifetime=0
        <span class="token punctuation">-</span> <span class="token punctuation">-</span><span class="token punctuation">-</span>session<span class="token punctuation">-</span>timeout=0
        <span class="token punctuation">-</span> <span class="token punctuation">-</span><span class="token punctuation">-</span>command<span class="token punctuation">-</span>enable=true
        <span class="token punctuation">-</span> <span class="token punctuation">-</span><span class="token punctuation">-</span>command<span class="token punctuation">-</span>address=0.0.0.0
        <span class="token punctuation">-</span> <span class="token punctuation">-</span><span class="token punctuation">-</span>command<span class="token punctuation">-</span>allow<span class="token punctuation">-</span>no<span class="token punctuation">-</span>pass=0/0
        <span class="token punctuation">-</span> <span class="token punctuation">-</span><span class="token punctuation">-</span>command<span class="token punctuation">-</span>port=36330
        <span class="token key atrule">env</span><span class="token punctuation">:</span>
        <span class="token punctuation">-</span> <span class="token key atrule">name</span><span class="token punctuation">:</span> NVIDIA_VISIBLE_DEVICES
          <span class="token key atrule">value</span><span class="token punctuation">:</span> <span class="token string">"all"</span>
        <span class="token key atrule">volumeMounts</span><span class="token punctuation">:</span>
        <span class="token punctuation">-</span> <span class="token key atrule">name</span><span class="token punctuation">:</span> folding<span class="token punctuation">-</span>at<span class="token punctuation">-</span>home<span class="token punctuation">-</span>data
          <span class="token key atrule">mountPath</span><span class="token punctuation">:</span> /var/opt/folding
      <span class="token key atrule">volumes</span><span class="token punctuation">:</span>
      <span class="token punctuation">-</span> <span class="token key atrule">name</span><span class="token punctuation">:</span> folding<span class="token punctuation">-</span>at<span class="token punctuation">-</span>home<span class="token punctuation">-</span>data
        <span class="token key atrule">hostPath</span><span class="token punctuation">:</span>
          <span class="token comment"># directory location on host</span>
          <span class="token key atrule">path</span><span class="token punctuation">:</span> /data/folding<span class="token punctuation">-</span>at<span class="token punctuation">-</span>home
          <span class="token comment"># this field is optional</span>
          <span class="token key atrule">type</span><span class="token punctuation">:</span> Directory
      <span class="token punctuation">-</span> <span class="token key atrule">name</span><span class="token punctuation">:</span> transmission<span class="token punctuation">-</span>ui<span class="token punctuation">-</span>sa
        <span class="token key atrule">secret</span><span class="token punctuation">:</span>
          <span class="token key atrule">secretName</span><span class="token punctuation">:</span> transmission<span class="token punctuation">-</span>ui<span class="token punctuation">-</span>sa
<span class="token punctuation">---</span>
<span class="token key atrule">apiVersion</span><span class="token punctuation">:</span> v1
<span class="token key atrule">kind</span><span class="token punctuation">:</span> Service
<span class="token key atrule">metadata</span><span class="token punctuation">:</span>
  <span class="token key atrule">name</span><span class="token punctuation">:</span> folding<span class="token punctuation">-</span>at<span class="token punctuation">-</span>home
  <span class="token key atrule">namespace</span><span class="token punctuation">:</span> prod
<span class="token key atrule">spec</span><span class="token punctuation">:</span>
  <span class="token key atrule">type</span><span class="token punctuation">:</span> NodePort
  <span class="token key atrule">ports</span><span class="token punctuation">:</span>
  <span class="token punctuation">-</span> <span class="token key atrule">targetPort</span><span class="token punctuation">:</span> web
    <span class="token key atrule">name</span><span class="token punctuation">:</span> web
    <span class="token key atrule">port</span><span class="token punctuation">:</span> <span class="token number">9092</span>
  <span class="token punctuation">-</span> <span class="token key atrule">targetPort</span><span class="token punctuation">:</span> command
    <span class="token key atrule">name</span><span class="token punctuation">:</span> command
    <span class="token key atrule">port</span><span class="token punctuation">:</span> <span class="token number">9093</span>
  <span class="token key atrule">selector</span><span class="token punctuation">:</span>
    <span class="token key atrule">app</span><span class="token punctuation">:</span> folding<span class="token punctuation">-</span>at<span class="token punctuation">-</span>home
<span class="token punctuation">---</span>
<span class="token key atrule">apiVersion</span><span class="token punctuation">:</span> networking.k8s.io/v1
<span class="token key atrule">kind</span><span class="token punctuation">:</span> NetworkPolicy
<span class="token key atrule">metadata</span><span class="token punctuation">:</span>
  <span class="token key atrule">name</span><span class="token punctuation">:</span> folding<span class="token punctuation">-</span>at<span class="token punctuation">-</span>home<span class="token punctuation">-</span>ui<span class="token punctuation">-</span>restrict
  <span class="token key atrule">namespace</span><span class="token punctuation">:</span> prod
<span class="token key atrule">spec</span><span class="token punctuation">:</span>
  <span class="token key atrule">podSelector</span><span class="token punctuation">:</span>
    <span class="token key atrule">matchLabels</span><span class="token punctuation">:</span>
      <span class="token key atrule">app</span><span class="token punctuation">:</span> folding<span class="token punctuation">-</span>at<span class="token punctuation">-</span>home
  <span class="token key atrule">policyTypes</span><span class="token punctuation">:</span>
  <span class="token punctuation">-</span> Ingress
  <span class="token key atrule">ingress</span><span class="token punctuation">:</span>
  <span class="token punctuation">-</span> <span class="token key atrule">from</span><span class="token punctuation">:</span>
    <span class="token punctuation">-</span> <span class="token key atrule">ipBlock</span><span class="token punctuation">:</span>
        <span class="token key atrule">cidr</span><span class="token punctuation">:</span> 192.168.0.10/32
    <span class="token key atrule">ports</span><span class="token punctuation">:</span>
    <span class="token punctuation">-</span> <span class="token key atrule">protocol</span><span class="token punctuation">:</span> TCP
      <span class="token key atrule">port</span><span class="token punctuation">:</span> <span class="token number">4180</span>
  <span class="token punctuation">-</span> <span class="token key atrule">from</span><span class="token punctuation">:</span>
    <span class="token punctuation">-</span> <span class="token key atrule">ipBlock</span><span class="token punctuation">:</span>
        <span class="token key atrule">cidr</span><span class="token punctuation">:</span> 192.168.0.0/28
    <span class="token punctuation">-</span> <span class="token key atrule">ipBlock</span><span class="token punctuation">:</span>
        <span class="token key atrule">cidr</span><span class="token punctuation">:</span> 127.0.0.0/28
    <span class="token key atrule">ports</span><span class="token punctuation">:</span>
    <span class="token punctuation">-</span> <span class="token key atrule">protocol</span><span class="token punctuation">:</span> TCP
      <span class="token key atrule">port</span><span class="token punctuation">:</span> <span class="token number">36330</span>
</code></pre></div><figcaption>StatefulSet, Service, and NetworkPolicy for FAH on my PC</figcaption></figure><p>A bit of finagling was needed to get this running - managing Secrets and tweaking the Dockerfile - but after a dozen commits or so, I was up and folding at <a href="https://folding.palpant.us/?ref=ghost.justin.palpant.us">folding.palpant.us</a> (private, access-controlled).</p><h3 id="-glass-shattering-"><em>*Glass shattering*</em></h3><p>Everything went swimmingly for a couple of days, churning out a few million Points of folding research and keeping my system comfortably warm with high GPU utilization, when suddenly GPU <a href="https://gitlab.palpant.us/grafana/dashboard/snapshot/hNN4RpA9QKxkzmuYcxuVAOSMIU1vH2mG?ref=ghost.justin.palpant.us">utilization went way down</a>.</p><!--kg-card-begin: html--><iframe title="GPU Utilization Graph" src="https://gitlab.palpant.us/-/grafana/dashboard-solo/snapshot/hNN4RpA9QKxkzmuYcxuVAOSMIU1vH2mG?orgId=1&#x26;from=1583760530497&#x26;to=1584037526495&#x26;var-cluster=palpantlab-prometheus-sfo&#x26;var-hostname=ubuntu-node-01&#x26;var-node=All&#x26;var-maxmount=%2Fhome&#x26;var-env=&#x26;var-name=&#x26;panelId=179" style="width:100%" height="400px" frameborder="0"/><!--kg-card-end: html--><p>The container's logs immediately showed some very obvious failure messages - the GPU core was failing its assigned work units on startup with a cryptic note:</p><div class="kg-card kg-code-card gatsby-highlight" data-language="bash"><pre class="language-bash"><code class="language-bash">00:51:18:WU01:FS01:Starting
00:51:18:WU01:FS01:Running FahCore: /usr/bin/FAHCoreWrapper /var/opt/folding/cores/cores.foldingathome.org/v7/lin/64bit/Core_22.fah/FahCore_22 -dir 01 -suffix 01 -version <span class="token number">705</span> -lifeline <span class="token number">8</span> -checkpoint <span class="token number">15</span> -gpu-vendor nvidia -opencl-platform <span class="token number">0</span> -opencl-device <span class="token number">0</span> -cuda-device <span class="token number">0</span> -gpu <span class="token number">0</span>
00:51:18:WU01:FS01:Started FahCore on PID <span class="token number">70</span>
00:51:18:WU01:FS01:Core PID:74
00:51:18:WU01:FS01:FahCore 0x22 started

<span class="token punctuation">..</span>.

00:51:18:WU01:FS01:0x22:Project: <span class="token number">11741</span> <span class="token punctuation">(</span>Run <span class="token number">0</span>, Clone <span class="token number">2360</span>, Gen <span class="token number">1</span><span class="token punctuation">)</span>
00:51:18:WU01:FS01:0x22:Unit: 0x000000018ca304f15e67d8cb67bdf2b9
00:51:18:WU01:FS01:0x22:Reading <span class="token function">tar</span> <span class="token function">file</span> core.xml
00:51:18:WU01:FS01:0x22:Reading <span class="token function">tar</span> <span class="token function">file</span> integrator.xml
00:51:18:WU01:FS01:0x22:Reading <span class="token function">tar</span> <span class="token function">file</span> state.xml
00:51:18:WU01:FS01:0x22:Reading <span class="token function">tar</span> <span class="token function">file</span> system.xml
00:51:19:WU01:FS01:0x22:Digital signatures verified
00:51:19:WU01:FS01:0x22:Folding@home GPU Core22 Folding@home Core
00:51:19:WU01:FS01:0x22:Version <span class="token number">0.0</span>.2
00:51:21:WU01:FS01:FahCore returned: INTERRUPTED <span class="token punctuation">(</span><span class="token number">102</span> <span class="token operator">=</span> 0x66<span class="token punctuation">)</span></code></pre></div><p>I checked whether any of the usual suspects were the problem: pod crashes, issues with NVIDIA drivers or OpenCL, <a href="https://lmgtfy.com/?q=INTERRUPTED+(102+%3D+0x66)&#x26;ref=ghost.justin.palpant.us">googling obviously Googleable error messages</a>. Most solutions pointed at a problem with the specific work unit, or a retriable failure, so I deleted the pod, deleted the data in the data directory, deleted everything I could think of to reset - without success.</p><p>And so I ended up where every good software project ends up - trawling the forums and finally <a href="https://foldingforum.org/viewtopic.php?f=74&#x26;t=32073&#x26;start=60&#x26;ref=ghost.justin.palpant.us#p312093">asking for help</a>. The good people on foldingforum.org engaged quickly and tried to help, but I went in knowing that my setup is fairly uncommon and, to be frank, I didn't think it would be worth their effort to get my one oddly configured RTX 2080 online when there was so much other work to do.</p><p>I was able to isolate the issue to my Kubernetes setup by running FAHClient manually on the desktop and then running the Docker image I was using directly with <code class="language-text">docker run</code>. 
But even with that narrow scope, I wasn't able to figure out the problem.</p><h3 id="you-see-but-do-you-observe">You see, but do you observe?</h3><p>The next day, it occurred to me to check my cluster resource usage Grafana dashboard - specifically, <a href="https://gitlab.palpant.us/grafana/dashboard/snapshot/pE5puXjQnq2hh00XZzT770PAeTd38wZn?ref=ghost.justin.palpant.us">the memory usage chart for this Pod</a>.</p><!--kg-card-begin: html--><iframe title="Folding At Home Memory Usage Graph" src="https://gitlab.palpant.us/-/grafana/dashboard-solo/snapshot/pE5puXjQnq2hh00XZzT770PAeTd38wZn?orgId=1&#x26;from=1584029877222&#x26;to=1584049165425&#x26;var-Node=All&#x26;var-namespace=All&#x26;var-pod=folding-at-home-0&#x26;var-cluster=palpantlab-prometheus-sfo&#x26;panelId=25" style="width:100%" height="400px" frameborder="0"/><!--kg-card-end: html--><p>That pattern was immediately familiar, and telling - a container hitting the 500Mi memory limit I had assigned, then crashing. But typically the pod would have been killed with termination reason OOMKilled, and my Alertmanager setup would have notified me with a PodFrequentlyRestarting alert. <em>Was one process being killed and not the pod? Is that possible?</em></p><p>Yes.</p><figure class="kg-card kg-bookmark-card"><a class="kg-bookmark-container" href="https://github.com/kubernetes/kubernetes/issues/50632?ref=ghost.justin.palpant.us"><div class="kg-bookmark-content"><div class="kg-bookmark-title">Container with multiple processes not terminated when OOM · Issue #50632 · kubernetes/kubernetes</div><div class="kg-bookmark-description">/kind bug What happened: A pod container reached its memory limit. Then the oom-killer killed only one process within the container. 
This container has a uwsgi python server which gave this error i...</div><div class="kg-bookmark-metadata"><img class="kg-bookmark-icon" src="https://github.githubassets.com/favicon.ico" alt="Folding@Home on Kubernetes"/><span class="kg-bookmark-author">GitHub</span><span class="kg-bookmark-publisher">kubernetes</span></div></div><div class="kg-bookmark-thumbnail"><img src="https://avatars1.githubusercontent.com/u/13629408?s=400&#x26;v=4" alt="Folding@Home on Kubernetes"/></div></a></figure><p>That issue pointed me to the <code class="language-text">node_vmstat_oom_kill</code> metric that is exposed if you run <a href="https://github.com/prometheus/node_exporter?ref=ghost.justin.palpant.us">node_exporter</a>, and sure enough, something on my server <a href="https://gitlab.palpant.us/grafana/dashboard/snapshot/lIjyYQnNWGqD5gmFnrNhyexjNhgpm1Pd?ref=ghost.justin.palpant.us">was being OOM killed</a> approximately once per minute.</p><!--kg-card-begin: html--><iframe title="Out-of-memory kill rate graph" src="https://gitlab.palpant.us/-/grafana/dashboard-solo/snapshot/lIjyYQnNWGqD5gmFnrNhyexjNhgpm1Pd?orgId=1&#x26;from=1583752659462&#x26;to=1584078980759&#x26;var-cluster=palpantlab-prometheus-sfo&#x26;var-hostname=ubuntu-node-01&#x26;var-node=All&#x26;var-maxmount=%2Fhome&#x26;var-env=&#x26;var-name=&#x26;panelId=181" style="width:100%" height="400px" frameborder="0"/><!--kg-card-end: html--><h3 id="clean-up">Clean up</h3><p>From this point, the resolution was simple - <a href="https://gitlab.palpant.us/justin/folding/-/commit/5c00a8415015105f5967e97bf7cba510503fdeaf?ref=ghost.justin.palpant.us">remove the offending memory limit</a> and <a href="https://gitlab.palpant.us/justin/folding/pipelines/866?ref=ghost.justin.palpant.us">deploy the change</a>.</p><p>Immediately after the work unit started up, I could see <a href="https://gitlab.palpant.us/grafana/dashboard/snapshot/J5nadbBuuFZJIUuXXJJn1UrvzcW5UsTH?ref=ghost.justin.palpant.us">memory usage</a> 
spike well beyond where the limit had been.</p><!--kg-card-begin: html--><iframe title="Memory usage spike graph" src="https://gitlab.palpant.us/-/grafana/dashboard-solo/snapshot/J5nadbBuuFZJIUuXXJJn1UrvzcW5UsTH?orgId=1&#x26;from=1584012717033&#x26;to=1584140948707&#x26;var-Node=All&#x26;var-namespace=All&#x26;var-pod=folding-at-home-0&#x26;var-cluster=palpantlab-prometheus-sfo&#x26;panelId=25" style="width:100%" height="400px" frameborder="0"/><!--kg-card-end: html--><p>Shortly after that, when I tried to read <code class="language-text">node_vmstat_oom_kill</code> for the nodes in <a href="https://gitlab.palpant.us/justin/palpantlab-infra/-/blob/master/README.md?ref=ghost.justin.palpant.us#palpantlab-gke-west1-b-01">my GKE cluster</a>, I realized that <code class="language-text">node_exporter</code> wasn't running on those nodes, so I <a href="https://gitlab.palpant.us/justin/palpantlab-gitlab/-/commit/2140dbd327190fa2dfb229f2a97bef04430d3ac7?ref=ghost.justin.palpant.us">enabled it</a>.</p><h3 id="takeaways">Takeaways</h3><p>First, observability is important - exit codes and log messages are helpful, but cluster-level monitoring can give detail about a misbehaving process even if you know <em>nothing</em> about that process. FAHClient isn't open source (<a href="https://github.com/FoldingAtHome/fah-web-client/issues/6?ref=ghost.justin.palpant.us#issuecomment-597474010">yet!</a>), so my ability to dive deeper was limited, and knowing what I know now, the INTERRUPTED message and the corresponding code are likely very deep in the stack. 
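</p><p>For illustration - the snapshots above come from my own dashboards, whose exact expressions aren't shown here - the kind of PromQL that surfaces this is a simple function over the vmstat counter, assuming node_exporter's vmstat collector is enabled:</p><div class="kg-card kg-code-card gatsby-highlight" data-language="promql"><pre class="language-promql"><code class="language-promql"># OOM kills per node over the trailing hour, from the /proc/vmstat counter
increase(node_vmstat_oom_kill[1h])

# or, as an alert-friendly rate: kills per second averaged over 5 minutes
rate(node_vmstat_oom_kill[5m])</code></pre></div><p>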
Prometheus and Grafana made that irrelevant.</p><p>Second - <a href="https://foldingathome.org/?ref=ghost.justin.palpant.us">get folding</a>!</p><figure class="kg-card kg-embed-card kg-card-hascaption"><iframe width="480" height="270" src="https://www.youtube.com/embed/RGGzMQ2oFrA?feature=oembed" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""/><figcaption>F@H promo video by Bowman lab</figcaption></figure><p><em>Title image credit: Alissa Eckert, MS; Dan Higgins, MAM available at <a href="https://phil.cdc.gov/Details.aspx?pid=23311&#x26;ref=ghost.justin.palpant.us">https://phil.cdc.gov/Details.aspx?pid=23311</a></em></p>]]></content:encoded></item></channel></rss>