Dates and Time 135
Example 6.3 Finding the Writing Style of a Writer
Let us look at one more example of text processing. Suppose you are interested in
comparing the writing styles of different writers. One of the measures that you are
going to use for comparison is the average sentence length in words because some writ-
ers use simple, short sentences and some have a tendency to use complex, long sentences.
You also want to measure the average word length in terms of how many characters
per word. You have collected several representative samples, in electronic form, of the
writers you are comparing. Having just learned about the text functions in Excel you
are eager to apply them for the writer comparison task.
Let us assume that you have copied and pasted a sample writing you are going to use for
comparison in cell A1. Let us now see how we can count the number of words in the
sample text in cell A1. Recall that the LEN function gives us the number of characters
in a string and the TRIM function removes all excess spaces. We can thus measure the
length of the sample text in cell A1 by the following formula:
=LEN(TRIM(A1))
We know that after trimming the words in the text are separated by single spaces. Thus,
if we were to get the length of the text after removing spaces, we can find the number
of words in the text. Therefore, we can use the following formula to measure the text
length after removing the spaces:
=LEN(SUBSTITUTE(TRIM(A1),” “,””))
In the above formula, we are using the SUBSTITUTE function to remove all spaces. The
difference between the above two lengths incremented by 1 then gives us the count of
the number of words in the text. Next, let us try to find the number of sentences in the
text. To find the number of sentences, we can use the same idea that we used for finding
words. We will remove all periods (.) from the text using the SUBSTITUTE function
and then measure the length. The difference in two lengths will tell us the number
of periods and hence the number of sentences. Thus, we use the following formula to
measure length after removing all periods:
=LEN(SUBSTITUTE(SUBSTITUTE(TRIM(A1),“ “,””),”.”,””))
Once we have the number of sentences, the total number of words, and the total number
of characters, we can easily get the average words per sentence and characters per word
for comparison.
Dates and Time
A date in Excel is simply a serial number that represents the number of days since
January 0, 1900 (a virtual “date” created for Windows). For example, 1/1/1900 is 1 and
1/15/1999 is serial number 36175. By treating dates as numbers, we can use them in
formulas and manipulate them. Excel recognizes several formats for entering dates in a
worksheet. These formats and their corresponding Excel interpretations are shown in
Chapter 6 Working with Text, Dates, and Times136
Table 6.2. Basically, Excel assumes you are referring to the current year if you do not
specify the year in your date entry.
When you enter data in a cell in one of the valid date formats, Excel expects the data to
represent a valid date. If the data does not convert to a valid date, Excel treats the data
as text. For example, if you enter 14/17/1979 in a cell, it will be treated as text because
months only go up to 12.
The idea of a date serial number is extended in Excel to represent time. For example, 12
noon, January 15, 1999 is represented internally as 36175.5, and 3:00 PM, January 15,
1999 as 36175.625. The smallest fraction of time unit in Excel is one one-thousandth
of a second which is equivalent to 0.00000001157. Table 6.3 shows different times of
day and their equivalent serial numbers.
Just like dates, Excel recognizes certain entries as time. For example, 10:30 is automati-
cally recognized as 10:30 AM and 15:30 as 3:30 PM. When you need to enter both
date and time, Excel expects you to enter the date first and then the time with at least a
Table 6.2 Excel Formats and Interpretations for Dates in a Worksheet
Cell Entry Excel’s Interpretation
10-17-79 October 17, 1979
10/17/79 October 17, 1979
10/17 October 17 of the current year
10-17 October 17 of the current year
1979/10/17 October 17, 1979
Oct 17 October 17 of the current year
Table 6.3 Times of Day and Excel’s Equivalent Serial Numbers
Time of Day Time Serial Number
12:00:00 AM
0.00000000
3:00:00 AM
0.12500000
6:00:00 AM
0.25000000
10:30:00 AM
0.43750000
1:30:00 PM
0.56250000
10:30:00 PM
0.93750000

Get Computing with Excel and VBA now with O’Reilly online learning.

O’Reilly members experience live online training, plus books, videos, and digital content from 200+ publishers.