Cost recovery in the Unix world, Part 2
Production costs keep rising. Is it time to set up your Unix chargeback system? We focus here on disk and network cost recovery
Last month, we started looking at cost recovery for Unix systems. Charging users for services has been a part of mainframe computing for decades. With the rise in price and complexity of Unix systems, similar cost recovery models will be needed to pay for all of your systems. If you haven't already started charging customers based on usage, you may be soon. Now is the time to put together a chargeback strategy to ensure you'll be in good shape when it's time to start sending out the bills.
In the first part of our look at Unix cost recovery, we dealt with two important issues: constructing a cost model for your operation and charging for CPU usage. Both can be fairly complex, but are critical to accurate cost recovery. Now, we'll examine charging for disk space, backups and network usage, and close with some tool recommendations and a bit of advice.
Charging for disk space
Compared to CPU chargeback models, accounting for disk usage is fairly straightforward. There are two schemes you should consider: charging for usage and charging for allocation.
In usage-based accounting, you periodically count how much disk space is consumed by each user, multiply by some cost factor, and bill accordingly. This kind of billing is useful when many users share a common filesystem, for example, all the user accounts in /home. You can use the du command to traverse each user directory, generating the total space consumed for each user.
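As a rough illustration of the per-user tally described above, here is a minimal Python sketch. It is not a replacement for du: it adds up file sizes rather than allocated disk blocks, so its totals can differ from what du reports. The function names and the idea of one directory per user under a single root are assumptions made for the example.

```python
import os
import stat

MB = 1024 * 1024

def directory_bytes(root):
    """Add up the sizes of all regular files under root (symlinks are not followed)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                info = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(info.st_mode):
                total += info.st_size
    return total

def usage_charges(home_root, cents_per_mb):
    """Return {user: (megabytes_used, charge_in_cents)} for each directory in home_root."""
    charges = {}
    for user in sorted(os.listdir(home_root)):
        path = os.path.join(home_root, user)
        if os.path.isdir(path) and not os.path.islink(path):
            used_mb = directory_bytes(path) / MB
            charges[user] = (used_mb, used_mb * cents_per_mb)
    return charges
```

Calling usage_charges("/home", rate) with your own per-megabyte rate yields one line item per user directory. Run it as a user who can read every directory, or the unreadable files will simply be left out of the totals.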
For this kind of accounting to work well, you must ensure that all users keep their files in a predefined set of directories. You can use normal Unix access permissions to enforce your disk usage rules. If you create an environment where different users can create files in a shared area, you'll need to create more complex scripts to find all the files owned by a specific user, and then add up all the space consumed by those files. This is not impossible, of course, but it does make things a bit more complicated.
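For the shared-area case, one possible approach is to walk the tree once and total the space by file owner instead of by directory. The sketch below is an assumption about how such a script might look, not a description of any standard tool; the function name is made up, and it counts hard-linked files only once.

```python
import os
import stat
from collections import defaultdict

def bytes_by_owner(root):
    """Map each owning uid to the total size of the regular files it owns under root."""
    totals = defaultdict(int)
    seen = set()  # (device, inode) pairs, so a hard-linked file is counted once
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                info = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(info.st_mode):
                continue
            key = (info.st_dev, info.st_ino)
            if key in seen:
                continue
            seen.add(key)
            totals[info.st_uid] += info.st_size
    return dict(totals)
```

The result is keyed by numeric uid. On a Unix system, pwd.getpwuid(uid).pw_name from Python's standard library will turn each uid into a login name for the bill.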
The big problem with usage-based accounting is that no one is paying for the space they aren't using. This may seem obvious, but it can wreak havoc with your cost recovery plans. Charges for disk usage need to recover the total cost of your storage, not just the part in use. If you choose to go with usage-based accounting, you'll need to determine the average usage of your storage, much like you had to determine the duty cycle of your CPU to set a rate for your computing cost recovery.
As an example, let's buy a 9-gigabyte (GB) disk drive for $1,500. After formatting and creating a Unix filesystem, you'll have about 8.5 GB (or roughly 8,700 megabytes) of usable space. Dividing $1,500 by 8,700, we come up with a cost of 17.24 cents per megabyte (MB). If we assume a three-year capital write-down of the expense, we can divide this by 36 to yield a price of 0.48 cents per megabyte per month. As with our CPU model, however, we will only break even if the disk is fully utilized from the moment we install it. Because this is never the case, we need to raise our rates to cover the unused part of the disk. Assuming that, on average, 60 percent of the drive (about 5,200 MB) is in use, we should charge $1,500 divided by 5,200, divided by 36, or 0.80 cents per megabyte per month.
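The rate arithmetic in this example is easy to capture in a few lines, which makes it simple to rerun when drive prices or your utilization estimate change. This is only a sketch of the calculation above; the function and parameter names are invented for illustration.

```python
def monthly_rate_cents_per_mb(purchase_dollars, usable_mb, months, utilization=1.0):
    """Monthly charge per megabyte, in cents, needed to recover the purchase price.

    utilization is the average fraction of the disk actually in use (0 < utilization <= 1).
    Only the space in use generates revenue, so a lower utilization means a higher rate.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be greater than 0 and at most 1")
    billable_mb = usable_mb * utilization
    return purchase_dollars * 100 / billable_mb / months

# The article's 9-GB drive: $1,500, about 8,700 MB usable, written down over 36 months.
full_disk_rate = monthly_rate_cents_per_mb(1500, 8700, 36)
realistic_rate = monthly_rate_cents_per_mb(1500, 8700, 36, utilization=0.60)
```

With these inputs the fully utilized rate comes out just under half a cent per megabyte per month, and the 60 percent rate comes out at about 0.80 cents.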
Even with this adjustment, our cost recovery varies wildly as users create and delete files. Costs can change depending on when you decide to measure disk usage. If you only count usage once a month, savvy users will delete or compress everything they can just before the end of the month to reduce costs, and then recover and uncompress at the start of the next billing cycle. I once worked in a shop where users would spool huge amounts of storage to scratch tapes every night at 11 p.m., knowing that disk charging occurred at midnight. At 1 a.m., they would spool it all back, thus saving a lot of money.
Allocation-based charging eliminates all of these problems. Instead of charging for each byte used, you allocate storage to users by the filesystem and charge a flat fee for the entire filesystem. Whether the user utilizes the space or not, he or she pays for it.
For this kind of chargeback, things get much easier if you enforce a minimum allocation amount. I'd recommend at least a 1-GB minimum. For our example 9-GB drive, we can carve out eight 1-GB filesystems, allowing for overhead and such. Our pricing is easy: $1,500 divided by 8 yields $187.50, and dividing that by 36 gives a price of about $5.20 per gigabyte per month. We don't care if the users ever use any of this space, so we don't have to adjust for average usage or fear huge changes in space consumed.
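A short sketch of the allocation arithmetic, assuming one flat price per filesystem and the 1-GB minimum suggested above. The function names are hypothetical, and billing a smaller request at the minimum size is one possible way to enforce that minimum, not the only one.

```python
def price_per_filesystem(purchase_dollars, filesystem_count, months):
    """Monthly price, in dollars, for each equal-sized filesystem carved from a drive."""
    return purchase_dollars / filesystem_count / months

def allocation_bill(allocated_gb, price_per_gb, minimum_gb=1):
    """Monthly bill for an allocation; anything below the minimum is billed at the minimum."""
    billable_gb = max(allocated_gb, minimum_gb)
    return billable_gb * price_per_gb

# Eight 1-GB filesystems from a $1,500 drive, recovered over 36 months.
price = price_per_filesystem(1500, 8, 36)
three_gb_bill = allocation_bill(3, price)
```

The unrounded price is a little over $5.20 per gigabyte per month. Notice that nothing here measures actual usage, which is exactly the point of allocation-based charging.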
In general, users are happier with allocation-based chargeback since they always retain control over how they use their storage. A problem with usage-based charging is that all the space is in a shared pool, and one customer can use up all the available space, leaving everyone else in the lurch. Allocation-based chargeback gives users total control over their storage management because they pay for both the space they use and the space they will need at some point in the future. And no one can intrude on that extra space.
Customer management for disk chargeback is more difficult than CPU chargeback in that users can easily see how much disk drives cost on the open market and will sometimes compare your pricing to the cost of raw storage. Since disk prices drop continuously, the rate you set for storage you added last year will be higher than the rate you would set for the same storage bought this year. When that 9-GB drive drops to $1,000, a smart user will question why he continues to be billed at the $1,500 rate. You may want to adjust your initial pricing to be somewhat low the first year, equivalent to market prices the second year, and higher in the third year. This entails becoming good at predicting pricing in the disk storage market. If you become especially good at this, you may want to consider a shift from the systems administration field to more gainful employment in Las Vegas.
One more complicating factor: Few large installations buy disks by the single drive. Instead, they use large RAID arrays and other technology to deliver high performance and reliability. Your pricing must be computed on the total cost of your disk subsystem, which includes the overhead of the RAID controllers, cache units, Fibre Channel networks, etc. The cost per gigabyte for these systems is much higher than street pricing on single drives, and your users will be even more bothered by your higher disk pricing. You'll need to be ready to explain how the cost overhead of these systems is justified by the greater speed and reliability enjoyed by the users. This explanation will be as well received as the explanation from the Department of Defense outlining why toilet seats cost $600 each.
Charging for backups
Every production system has a robust backup system attached to it. Good backups aren't cheap, and you'll need to recover the cost of that backup system. While backup systems have a fixed cost in the form of the tape unit and robotic controllers, they have a variable cost due to the number of tapes used, retention periods, and backup policies.
The fundamental unit of usage in a backup system is a tape. What does it cost to keep a tape in the system? In round numbers, a 35- to 70-GB DLT tape costs about $100. Again, over a three-year period, you'll need to recover about $3 each month for that tape ($100 divided by 36). Just as important as the tape is the drive unit, and this will drive the price much higher.
Let's suppose the bean counters (and let's not forget -- you are one now!) were in a good mood, and you're currently the proud owner of an ATL P3000 DLT robotic tape library. This unit will hold up to 16 tape drives and include slots for 326 tape cartridges. It will cost you $200,000, but it's worth every penny.
For chargeback purposes, the important feature of the P3000 is the number of slots. A tape is useless unless it is in the unit and can be mounted in a drive. Those slots are valuable, and they have an easy-to-figure price. Capitalizing over three years, the P3000 will cost about $5,556 a month. Dividing by 326, it costs about $17 per month for each slot in the unit.
Now we have a real price for a tape: $3 for the media and $17 for the slot for a total of $20 per tape per month. A tape holds at least 35 GB, so this works out to around 57 cents per gigabyte per month for tape storage. Having priced allocation-based storage at $5.20 per gigabyte per month, you can add these together to get $5.77 per gigabyte per month for backed-up storage, which seems pretty reasonable.
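The tape pricing above can be sketched the same way. The function below is an illustration only, with made-up names; it uses the article's figures and the conservative 35-GB capacity, and it does not round the intermediate values.

```python
def tape_costs(tape_dollars, library_dollars, slots, tape_gb, months=36):
    """Return monthly (media, slot, per_tape, per_gb) costs in dollars."""
    media = tape_dollars / months             # write-down of the cartridge itself
    slot = library_dollars / months / slots   # each slot's share of the library
    per_tape = media + slot
    return media, slot, per_tape, per_tape / tape_gb

# $100 tape, $200,000 library with 326 slots, 35 GB per tape, 36 months.
media, slot, per_tape, per_gb = tape_costs(100, 200_000, 326, 35)
```

Unrounded, the media comes to a bit under $3, the slot to a bit over $17, and the total to roughly 57 cents per gigabyte per month, in line with the round numbers used in the text.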
This is all made more complicated by various tape rotation schedules. You might keep a weekly full backup and six incremental backups in the tape library, but rotate off three previous weekly backups to offsite storage. Keeping a tape offsite is much cheaper than keeping it in that expensive slot in the P3000, but you may want to charge a separate fee for retrieving an offsite tape. Given that offsite tapes are rarely mounted, a courier fee of $100 or more for each requested tape isn't unreasonable.
You may also allow tapes to live in the data center, but not in the tape unit. Mounts are easier (and thus cheaper), so you can set a separate price for this service. At this point, you need to consider what you want users to do: keep tapes in the unit, on the rack, or offsite. Adjust your pricing so that users have an incentive to do what you want them to do.
Charging for network
Of all the options open to an administrator implementing a cost recovery plan, I like charging for network usage the least. In today's environment, network usage is as much a part of the computing model as physical memory usage. Years ago, mainframes charged based on your usage of core memory during execution. Memory has become cheaper, and this kind of chargeback has, consequently, almost disappeared. In the same way, network bandwidth is practically becoming free, and it doesn't make sense to charge for something that is nearly ubiquitous.
Still, networks aren't free, and you need to recover the cost of those switches, hubs and wire. One way to think about networking is to burden the cost of the infrastructure across all of your systems, adding a fraction of the total monthly cost of the network hardware to each system's hardware cost. This amount is then used to drive the pricing of CPU usage. This model, while not perfect, allows you to recover networking fees as you recover regular systems fees.
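Here is one way the burdening described above might be computed. The text doesn't say how to pick each system's fraction, so this sketch assumes the network cost is split in proportion to each system's own monthly hardware cost; an even split would be just as defensible. The function name, system names, and dollar figures are all hypothetical.

```python
def burden_network_cost(system_monthly_costs, network_monthly_cost):
    """Add a proportional share of the monthly network cost to each system's monthly cost.

    system_monthly_costs maps a system name to its monthly hardware cost in dollars.
    """
    total = sum(system_monthly_costs.values())
    if total <= 0:
        raise ValueError("system costs must add up to a positive amount")
    return {
        name: cost + network_monthly_cost * cost / total
        for name, cost in system_monthly_costs.items()
    }

# Hypothetical shop: two servers and $400 a month of switches, hubs and wire.
burdened = burden_network_cost({"dbserver": 3000, "webserver": 1000}, 400)
```

The burdened figures then replace the raw hardware costs when you set your CPU rates, so the network is paid for without a separate line on anyone's bill.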
If you are considering network chargeback models, try to avoid models that set prices at the packet level. Beyond the fact that tracking packet usage back to specific users is almost impossible, packet-based accounting discourages people from using network resources. One of the basic tenets of Unix-based computing is that the network is a critical component of the architecture. We want to encourage developers to use the network as much as possible, distributing workload and resources across multiple machines. This is especially true of Web-based applications, where increased hit rates on servers are a measure of success that should be rewarded, not punished.
Using accounting tools
Unlike the mainframe world, where robust accounting tools are readily available, Unix accounting tools lag well behind in terms of features and usability.
To be honest, simple system accounting is easily handled by the standard Unix accounting tools. With a few shell and Perl scripts, you can extract enough information to implement most billing systems. Unless you want fancy reporting tools and fancy charts, you may never need additional tools.
If you need fancier stuff, or want to recover costs for database usage, you will want to look for third-party chargeback tools. One tool I've used with some success is ARSAP, made by GEJAC. This tool will probe your Oracle databases and extract accounting information that can be used for cost recovery. The ARSAP product has also been integrated into the CIMS system accounting product from Platinum Technology. With either version, you can expect to do a good amount of customizing to match both your system and your specific chargeback scheme.
Advice and suggestions
The hardest part of implementing Unix cost accounting and recovery has nothing to do with technology. Instead, it is your approach and the way you handle your customers.
From the administration side, your first inclination will be to engineer a chargeback scheme just like you've engineered your entire computing environment -- with slavish attention to detail and an assurance that you've accounted for every last jot and tittle of usage. Speaking from experience, I can tell you that you need to lose this attitude.
The goal is not to create an accounting system that captures and bills everything in gory detail. The goal is to create a system that is fair, consistent, and understood by your customers. If your customers think they're being fairly charged an amount equal to the value of your services, you have a successful accounting system. Don't worry if some processes in the system fall through the cracks, or if you missed some amount of overhead in your disk subsystem. If the users are happy, you're happy.
Keeping users happy is hard. No one likes to pay for anything -- especially people who are used to getting things for free. You'll need to work with them and help them understand why you're charging, and how they will benefit in terms of better accountability, easier budgeting, and more readily available computing resources. Most importantly, be ready to compromise in your chargeback schemes based on user suggestion. For heavy disk usage environments, you may want to recover everything through disk usage charges, dropping CPU cost recovery altogether. Compute-intensive shops may do just the opposite, letting disk usage be "free."
Be flexible and creative; the only goal is to ensure you recover enough money to run your shop, pay for expansion, and keep systems upgraded and up to date. If you hit that goal, make users happy, and still retain your sanity, you'll definitely be considered a great business success in the world of systems administration.
About the author
Chuck Musciano has been running various Web sites, including the HTML Guru Home Page, since early 1994. He serves up HTML tips and tricks to hundreds of thousands of visitors each month. He's been a beta tester and contributor to the NCSA httpd project and speaks regularly on the Internet, World Wide Web, and related topics. Chuck was formerly SunWorld's Webmaster columnist and is currently CIO at the American Kennel Club. Reach Chuck at email@example.com.