HP’s observations during testing at a throughput rate of about 100
requests/second showed the following average network traffic characteristics:
• Traffic to clients: 9,000KB/sec
• WFE traffic (read/write): 11,000KB/sec receive, 20,000KB/sec send
• WFE traffic (read only): 5,000KB/sec receive, 20,000KB/sec send
• SQL traffic (read/write): 4,000KB/sec receive, 10,000KB/sec send
• SQL traffic (read only): 2,000KB/sec receive, 10,000KB/sec send
If you are going to deploy SharePoint in a datacenter with a 100Mbit
backbone, we recommend that you create two virtual networks to separate
client traffic from the internal SQL and search traffic, so that no network
throughput bottlenecks occur.
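To see why a 100Mbit backbone is a concern, it helps to convert the figures above into megabits per second. The following is a minimal sketch rather than a measurement tool: the function names are our own, and it assumes 1KB = 1,000 bytes and that the link's full nominal rate is usable, which is optimistic in practice.

```python
# Average traffic observed by HP at about 100 requests/second (KB/sec),
# copied from the figures above.
TRAFFIC_KB_PER_SEC = {
    "clients": 9000,
    "WFE read/write receive": 11000,
    "WFE read/write send": 20000,
    "WFE read-only receive": 5000,
    "WFE read-only send": 20000,
    "SQL read/write receive": 4000,
    "SQL read/write send": 10000,
    "SQL read-only receive": 2000,
    "SQL read-only send": 10000,
}


def kb_per_sec_to_mbit(kb_per_sec, bytes_per_kb=1000):
    """Convert KB/sec to Mbit/sec (assumption: 1KB = 1,000 bytes)."""
    return kb_per_sec * bytes_per_kb * 8 / 1_000_000


def flows_exceeding(link_mbit, traffic=TRAFFIC_KB_PER_SEC):
    """Return the flows whose average rate alone exceeds the link speed."""
    return {
        name: kb_per_sec_to_mbit(rate)
        for name, rate in traffic.items()
        if kb_per_sec_to_mbit(rate) > link_mbit
    }


if __name__ == "__main__":
    for name, mbit in flows_exceeding(100).items():
        print(f"{name}: {mbit:.0f} Mbit/sec exceeds a 100Mbit link")
```

Under these assumptions, the WFE send traffic alone exceeds what a single 100Mbit link can carry, and several other flows come close, which is the reasoning behind separating client traffic from the internal SQL and search traffic.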
10.5 Disk Subsystem Planning
SharePoint 2007 disk subsystem planning is slightly trickier than for most
applications, due to the database disaster recovery options and the various
content indexes generated outside the database. Especially in a farm
configuration, you should carefully plan the I/O capacity requirements and
ensure ample space for future growth. Luckily, today’s copious storage
technology options have brought down the per-gigabyte cost of storage, so you
have choices when deciding upon your storage platform.
Shared Storage Area Networks (SANs) are pretty much the norm in
many enterprise datacenters, providing cost savings and easy upgrade paths.
Since SQL Server fully supports SAN storage, you can take advantage of this
technology across your SharePoint deployment. Still, even with a SAN you have
to define the slice of the shared storage pie that should be allocated to you
at the outset. If you choose to use directly attached storage instead, it’s
even more important to consider all future requirements beforehand, since
adding capacity might not be as straightforward on a directly attached
storage device.
The hard part is trying to come up with an accurate estimate of the “content
stored” part for an enterprise with thousands of users. Personally, I’ve been
bitten by this in the past and it appears I did not learn my lesson. Thus, from
bitter experience I recommend that you think of the worst possible scenario
and double it. That just might get you through the first 12 months and give
you time to establish more accurate storage growth patterns for your
enterprise.
10.5.1 Database Sizing
First and foremost, database sizing should be your primary concern.
Although it’s easy to add additional data files to SQL Server, you should try
to anticipate usage well in advance, since ordering additional storage can
take a while. With directly attached storage devices, you might also need to
plan for system downtime unless the devices support adding capacity online,
which SANs typically do.
Aside from the standard 1.5GB per farm for the configuration database, you
need to estimate the initial size of the data moving to the SharePoint
deployment and factor in the estimated number of new sites, keeping in mind
the site quota limits, to get a grasp on the required disk space for your
deployment’s SQL databases. Once you come up with the magic figure, you need
to multiply the total content storage size by 1.3 to get the item storage
size in SQL databases, due to additional item properties and associated
overhead. In
addition, it is generally recommended to forecast a healthy growth plan of at
least 50 percent of the initial projected storage so that you do not run out of
disk space right after go-live. We’ve seen many organizations discount the
almost viral growth that team collaboration sites tend to exhibit, at the cost
of possibly painful storage upgrades. For example, at HP we had to build a
whole new farm to accommodate SharePoint 2003 storage growth, because the
original projected growth was far surpassed by the massive upsurge in team
collaboration site usage. At the time, the SQL team recommended against SAN
storage and insisted on directly attached storage enclosures instead, which
have limited room for growth. Safe to say, we learned our lesson in time for
the SharePoint 2007 deployment.
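The arithmetic described above can be sketched in a few lines. This is only an illustration of one way to read the rule of thumb: the function name and parameter names are our own, and we assume the 50 percent growth allowance is applied to the item storage figure (content size times 1.3), with the 1.5GB configuration database added on top.

```python
def estimate_sql_storage_gb(
    content_gb,
    overhead_factor=1.3,   # item properties and associated overhead
    growth_fraction=0.5,   # at least 50 percent growth allowance
    config_db_gb=1.5,      # farm configuration database
):
    """Estimate the SQL disk space (GB) needed for a SharePoint farm."""
    # Content as stored in SQL, including per-item overhead.
    item_storage_gb = content_gb * overhead_factor
    # Growth allowance, assumed here to apply to the item storage figure.
    growth_gb = item_storage_gb * growth_fraction
    return item_storage_gb + growth_gb + config_db_gb


if __name__ == "__main__":
    # Hypothetical farm with 1,000GB of initial content.
    print(f"{estimate_sql_storage_gb(1000):.1f} GB")
```

For a hypothetical 1,000GB of initial content, this works out to roughly 1,950GB, nearly double the raw content size. Treat the growth fraction as a floor rather than a target, and raise it if team collaboration sites are likely to take off.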
The projected content storage requirements will help you plan and
implement your content database structure, since Microsoft best practices
recommend a 100GB limit for content databases. With HP’s SharePoint
deployment closing in on the 6TB mark at the time of writing, for example,
planning how to create and maintain these content databases was a top
priority in the design phase. For each of the 100GB content databases, you
should also project at least 35GB for the transaction log.
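To get a feel for what the 100GB limit means at scale, the following sketch counts the content databases and transaction log space a given amount of content would require. The function and its return keys are hypothetical names of our own, and the example assumes 1TB = 1,000GB.

```python
import math


def plan_content_databases(content_gb, max_db_gb=100, log_gb_per_db=35):
    """Return the number of content databases and log space needed."""
    # Each content database is capped at max_db_gb, so round up.
    db_count = math.ceil(content_gb / max_db_gb)
    # Reserve at least log_gb_per_db of transaction log per database.
    log_gb = db_count * log_gb_per_db
    return {
        "databases": db_count,
        "log_gb": log_gb,
        "total_gb": content_gb + log_gb,
    }


if __name__ == "__main__":
    # A deployment of about 6TB, taken here as 6,000GB.
    print(plan_content_databases(6000))
```

Under these assumptions, a deployment of about 6TB needs on the order of 60 content databases and a further 2,100GB for transaction logs, which is why the database structure deserves attention early in the design.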
For further details on the art of physical database storage design, you
should refer to the Microsoft TechNet article, located at
http://www.microsoft.com/technet/prodtechnol/sql/2005/physdbstor.mspx.
10.5.2 Search Index Sizing
In our experience, you should expect the index to be about 30 percent of the
total indexed content size. However, the exact size of the index corpus is
largely dependent on the content types you are going to be indexing. Still, it
pays to be extra safe, as the content index databases cannot be spread across