
Caching Yum Package Updates to Simulate the Bandwidth Benefits of RHN Satellites


18th May 2007 Datacenter, Networks, Systems

This month’s updates have left a lot of my servers with outdated packages. Bandwidth isn’t really much of a problem for my RHEL servers, as they sit on 100Mbit connections in Telehouse and Texas. However, all my home and office based CentOS servers sit on a mixture of 2Mbit SDSL, 20Mbit cable and 8Mbit ADSL connections. That’s not really a lot of bandwidth for what could be a couple of gigs of data, especially when you consider that most of these lines sit at around 80% capacity (I know, I know, don’t ask...).

For SysAdmins with hundreds of servers in a datacenter who can’t afford to saturate their links downloading all of these packages each and every time, RedHat offer some products, one of which is the RHN Satellite entitlement. I thought to myself that if RedHat can provide what they call RedHat Proxy for RHN Satellites, then I should be able to do something similar for Yum. (I’m letting the RHEL servers update directly because they have plenty of bandwidth available and, more importantly, I don’t have any Satellite entitlements!)

The first thing I considered was using Squid on my trusty router o’ doom, and then I remembered that it is running RedHat 9 and only has a 4GB HDD and 128MB of RAM. That idea was quickly scrapped. Then I realised that the CacheFlow 600 has dual NICs and, even with the neighbourhood trying to hammer my bandwidth, it was still sitting quite happily at only 13% capacity.

After configuring the CacheFlow (I’ll skip that bit, since this article is more about the benefits of caching and I guess I’m the only person who has a CacheFlow at home), it was a simple case of editing the yum.conf on each machine.

I didn’t want all traffic to be going through the CacheFlow (not sure why). Since the CacheFlow is not the default gateway for their side of the network, intercepting the traffic transparently would have required a static route on the Router O’ Doom to send all traffic from the CentOS machines bound for the repos through the CacheFlow. Obviously it wouldn’t be wise to forget to filter on the source of the requests, otherwise the CacheFlow’s own requests would get sent back to itself and I would probably end up in a whole new world of pain.

Adding a proxy for Yum to use is simple:

[main]
cachedir=/var/cache/yum
debuglevel=2
logfile=/var/log/yum.log
pkgpolicy=newest
distroverpkg=centos-release
tolerant=1
exactarch=1
retries=20
obsoletes=1
gpgcheck=1
proxy=http://172.16.0.7:8080
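
With more than a handful of machines, the same edit could be scripted. What follows is only a minimal sketch and not what was done here: the function name `set_yum_proxy` is hypothetical, and it assumes the file parses cleanly as INI. Be aware that Python's `configparser` drops comments when it rewrites a file, so keep a backup of the original.

```python
import configparser
import io


def set_yum_proxy(conf_text: str, proxy_url: str) -> str:
    """Return yum.conf-style text with proxy= set in the [main] section.

    Hypothetical helper for illustration; comments in the input are lost.
    """
    # Only '=' is treated as a delimiter so that URLs containing ':' are safe.
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(conf_text)
    if not parser.has_section("main"):
        parser.add_section("main")
    parser.set("main", "proxy", proxy_url)
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


if __name__ == "__main__":
    sample = "[main]\ncachedir=/var/cache/yum\ngpgcheck=1\n"
    print(set_yum_proxy(sample, "http://172.16.0.7:8080"))
```

The function works on text rather than on the file directly, so it can be tried against a copy of a config before anything under /etc is touched.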

With that done, I sent the first CentOS machine off on a little jaunt to do a full package update. With the CentOS Plus repository enabled the package list came to 178MB; multiply that by 5 machines and that’s 890MB, excluding overheads of course. OK, so maybe all of this is a bit overkill, but hey, it’s something to do, and after all electricity is free, isn’t it...
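
To put a number on the potential saving, here is the same arithmetic as a small sketch. It assumes an ideal cache (the first machine fills it and the other four are served entirely from it) and ignores protocol overhead, repository metadata and revalidation requests. The function name is made up for illustration.

```python
def internet_download_mb(update_mb: float, machines: int, cached: bool) -> float:
    """Data pulled over the internet link for one round of updates.

    Assumes a perfect cache: only the first machine's download reaches the
    internet, every later machine is served from the proxy.
    """
    return update_mb if cached else update_mb * machines


UPDATE_MB = 178  # size of the update set quoted in the article
MACHINES = 5     # number of CentOS machines

direct = internet_download_mb(UPDATE_MB, MACHINES, cached=False)
proxied = internet_download_mb(UPDATE_MB, MACHINES, cached=True)
saved = direct - proxied

print(f"direct: {direct} MB, via cache: {proxied} MB, "
      f"saved: {saved} MB ({saved / direct:.0%})")
```

Under those assumptions the internet link carries 178MB instead of 890MB, a saving of 712MB or 80%.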

Initial Run:
Time: 26 mins
Average Throughput: 146.4 kB/s

4 Remaining Servers:
Time: 9 mins
Average Throughput: 580.2 kB/s
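
As a rough sanity check on those figures, the sketch below compares the two throughput values and works out how long a pure 178MB transfer would take at each rate. It assumes 1 MB = 1000 kB and that each machine fetches the full 178MB. The article does not say whether the 9 minutes is per server or for all four, so the sketch makes no claim about that, and the measured times will include more than the raw download.

```python
def ideal_minutes(size_mb: float, rate_kb_per_s: float) -> float:
    """Pure transfer time in minutes, assuming 1 MB = 1000 kB."""
    return size_mb * 1000 / rate_kb_per_s / 60


UNCACHED_KBPS = 146.4  # average throughput of the initial run
CACHED_KBPS = 580.2    # average throughput of the remaining servers
UPDATE_MB = 178        # update size per machine (assumed equal on all five)

speedup = CACHED_KBPS / UNCACHED_KBPS
uncached_min = ideal_minutes(UPDATE_MB, UNCACHED_KBPS)
cached_min = ideal_minutes(UPDATE_MB, CACHED_KBPS)

print(f"throughput ratio: {speedup:.2f}x")
print(f"ideal transfer, uncached: {uncached_min:.1f} min")
print(f"ideal transfer, cached:   {cached_min:.1f} min")
```

The cached runs moved data at just under four times the rate of the initial run.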

Whilst the times and average speeds speak for themselves, the MRTG graphs of what happened are even better:

Graph: CacheFlow Client Throughput (CacheFlow to Clients)

Graph: CacheFlow Server Throughput (CacheFlow to Internet)

The small ‘blip’ in traffic towards the internet is the CacheFlow checking whether anything has changed since it was last cached.

The benefits here are apparent, even if at this scale it isn’t really worth the effort. However, NAMOS isn’t about that; hell, it’ll probably never be about that.