In this section we sketch a few ideas for programmer-defined classes that illustrate useful basic features of classes.
This is the Stack class promised in Chapter 3. A stack is a dynamic storage mechanism, where items are added or removed one at a time, and only the most recently added item is accessible (like a stack of plates). You understand that it’s completely unnecessary to implement a Stack class from scratch in REALbasic, because REALbasic provides dynamic arrays! So, we could easily implement a stack by using an array, adding items with Append and removing them with Remove. However, the example provides a flexible model that can easily be modified to form other storage mechanisms such as queues, and illustrates the power of object references as pointers to form linked data structures.
Our implementation, you recall, is that each item of the stack will
contain a value and a pointer back to the previously added item of
the stack; the earliest added item will point to
nil. In Chapter 3 these were
called ItsValue and ItsPointer, so let’s keep those names.
We’ll make it a stack of strings; so ItsValue is declared as
string, and ItsPointer is declared as Stack. The chief actions on a
stack are to add (push) or remove
(pop) an item at the front; it’s also
useful to know whether the stack is empty. The basic implementation,
shown in Example 4-1, is a simple matter of
Example 4-1. Stack class
// stack.push: Sub push(s as string) dim newItem as stack newItem = new stack newItem.itsPointer = self.itsPointer newItem.itsValue = s self.itsPointer = newItem End Sub // stack.pop: Function pop() As string dim value as string value = self.itsPointer.itsValue self.itsPointer = self.itsPointer.itsPointer return value End Function // stack.isEmpty: Function isEmpty() As boolean return (self.itsPointer = nil) End Function
We may test the Stack class as follows:
dim st as stack dim s as string st = new stack st.push "hey" st.push "ho" s = st.pop() // throw away, just testing st.push "hey" st.push "nonny" st.push "no" while not st.isEmpty // pop and show whole stack msgBox st.pop() // expect "no", "nonny", "hey", "hey" wend
The onus is on the caller not to pop an empty stack; in real life, Pop would probably raise an exception if the stack is empty, as described in Chapter 8. The example is remarkably economical. You might think we’d need two classes, one for the items of the stack and one to represent the stack as whole; instead, the reference that points at the stack is itself just like the items of the stack, but it doesn’t store anything in ItsValue.
We would like to extend the power of arrays. We cannot subclass an array, because an array is not a class. But we can make a class that has an array as a member and provides an extended interface to it. Such a class is called a wrapper. This section presents a class CArray that is a wrapper for a one-dimensional array of strings.
There are several advantages to this class over a normal array. We will be able to define new array operations that ordinary arrays lack: for instance, a Swap function will cause the values of two items to be interchanged, a Reverse function will allow us to sort either ascending or descending, a CopyTo function will copy all items from one array into another, and a Clone method will create a new instance and CopyTo that. Also, it will be easy to subclass CArray to provide for even more specialized types of arrays; imagine, for example, an array that is constantly kept sorted by means of a binary search. Most important, an array wrapper class is a real class, as opposed to an ordinary array in REALbasic, which is neither a class nor a datatype, neither object nor scalar; an array wrapper class is a full citizen and can do things that an ordinary array cannot. For example, you cannot declare an array of array, but you can declare an array of a class that wraps an array; and a class that wraps a multidimensional array can be passed as a parameter to a subroutine (unlike an ordinary multidimensional array).
Example 4-2 gives a sample implementation. We
provide wrapper functions for all the standard array operations,
using the standard names (except that we cannot name a method Redim
because that’s a reserved word, so MyRedim is used), plus Set
and Get (because we can’t overload the assignment operator),
plus our various extensions. The private array property is called
TheArray; it is declared as being of size
initially, but a constructor is provided that lets us create a new
CArray instance and Redim it to the desired size in a single command.
Example 4-2. cArray class
// cArray.cArray: Sub cArray() // constructor with no parameters // do nothing, accept declaration to -1 End Sub // cArray.cArray: Sub cArray(size as integer) // constructor with one parameter self.myRedim(size) End Sub // cArray.myRedim: Sub myRedim(size as integer) redim self.theArray(size) End Sub // cArray.set: Sub set(index as integer, s as string) self.theArray(index) = s End Sub // cArray.get: Function get(index as integer) As string return self.theArray(index) End Function // cArray.insert: Sub insert(index as integer, s as string) self.theArray.insert index, s End Sub // cArray.append: Sub append(s as string) self.theArray.append s End Sub // cArray.remove: Sub remove(index as integer) self.theArray.remove index End Sub // cArray.ubound: Function ubound() As integer return ubound(self.theArray) End Function // cArray.sort: Sub sort() self.theArray.sort End Sub // cArray.swap: Sub swap(index1 as integer, index2 as integer) dim s as string s = self.get(index1) self.set(index1, self.get(index2)) self.set(index2, s) End Sub // cArray.reverse: Sub reverse() dim i, u, u2 as integer u = self.ubound() u2 = u\2 for i = 0 to u2 self.swap(i, u-i) next End Sub // cArray.copyTo: Sub copyTo(c as cArray) dim i, u as integer u = min(self.ubound(), c.ubound()) for i = 0 to u c.set(i, self.get(i)) next End Sub // cArray.clone: Function clone() As cArray dim c as cArray c = new cArray(self.ubound()) self.copyto(c) return c End Function // cArray.display: Sub display() // for debugging dim i, u as integer u = self.ubound() for i = 0 to u msgbox self.get(i) next End Sub
Here is a brief test routine:
dim a, b, c as cArray a = new cArray(4) b = new cArray(2) a.set(0,"hey") a.set(1,"ho") a.set(2,"hey") a.set(3,"nonny") a.set(4,"no") msgbox "Here is a:" a.display // expect "hey", "ho", "hey", "nonny", "no" a.copyTo b msgbox "Here is b:" b.display // expect "hey", "ho", "hey" c = a.clone() c.sort c.reverse c.set(0, "yo") c.myRedim(2) msgbox "Here is c:" c.display // expect "yo", "no", "ho"
A hash is a storage structure that’s like a pigeonhole desk for storing index cards: a glance at a card tells us instantly which pigeonhole it goes into, but inside each pigeonhole the cards are just a jumble. The means used to identify the correct pigeonhole is called the hash function. For example, if the cards contain text, the first letter of the text on a card to be stored could act as a hash function, resulting in 26 pigeonholes. This wouldn’t be a very good hash function if all the texts started with the same few letters, or if there were many index cards, because some pigeonholes would end up with a jumble of many cards. But when the hash function and the nature and size of the data are such that we are likely to end up with an even distribution where just a very few items end up in each pigeonhole, hashing is a very fast and efficient mode of storage.
Hashing is typically a two-stage process: first use the hash function to choose the correct pigeonhole; then manipulate the jumble which that pigeonhole represents. So apart from the hash function, we must also decide how to implement the jumble. We’ll use a CArray, the string array wrapper class developed in the previous example.
Let’s say, then, that the problem is to store every individual word of a text, without storing any duplicates. We propose to implement a hash of strings as an array of CArray—except that to prevent duplicates we will need to be able to ask each CArray whether it already contains a word, so we will use a subclass of CArray called MyCArray which adds a Contains method handler:
Function contains(s as string) As boolean dim i, u as integer u = self.ubound for i = 0 to u if self.get(i) = s then return true end next return false End Function
For the sake of generality and flexibility, we will factor out the hash function from the class that does the hashing; that is, we will tell our hashing class instance in code what hash function to use, rather than hardcoding this information into the class itself. A different hash function can thus be substituted without modifying the class that does the hashing. To accomplish this goal, the hash function will be expressed by a class that implements a class interface. Call this class interface HashFuncPtr, and give it a method called HashFunc which accepts a string and returns an integer. This means that any class that has a HashFunc method can act as a HashFuncPtr, and can be used by our hashing class to derive the index for any string. (This technique is REALbasic’s equivalent of pointer-to-function.)
For this example, we will use a very fast but extraordinarily primitive and unrealistic hash function: we will sum the numeric equivalents of the first and last letters of the string (these numeric equivalents are discussed in Chapter 5). Create a class PrimitiveHashFunc which implements HashFuncPtr; its HashFunc method goes like this:
Function hashFunc(s as string) As integer return ascb (leftb(s, 1)) + ascb (rightb(s, 1)) End Function
We come at last to the class that does the hashing, which we call
Hasher. It has two properties, ItsArray (the array of MyCArray,
initially declared to size
-1) and ItsHashFunction
(the HashFuncPtr containing the hash function). The code, displayed
in Example 4-3, is minimal: we can store strings,
but we cannot retrieve them except for debugging purposes. The
instance is initialized through its constructor, which needs to be
told how big the array should be and what hash function to use.
Storage is then performed through the InsertIfAbsent method. The
GetHashIndex method just abstracts the code for calling the hash
Example 4-3. Hasher class
// hasher.hasher: Sub hasher(size as integer, f as hashFuncPtr) dim i as integer redim itsArray(size) for i = 0 to size self.itsarray(i) = new myCArray(-1) next itsHashFunction = f End Sub // hasher.getHashIndex: Function getHashIndex(s as string) As integer return self.itsHashFunction.hashFunc(s) End Function // hasher.insertIfAbsent: Sub insertIfAbsent(s as string) dim i as integer i = self.getHashIndex(s) if self.itsArray(i).contains(s) then // nothing to do return end self.itsArray(i).append s End Sub // hasher.displayDump: Function displayDump() As string // for debugging dim result as string dim i, j, u, uu as integer uu = ubound(itsArray) for i = 0 to uu u = self.itsArray(i).ubound for j = 0 to u result = result + "," + self.itsArray(i).get(j) next next return result End Function
Here is some code to test Hasher. Since the maximum numeric equivalent of a letter is 255, the largest hash value possible is 510.
dim h as hasher h = new hasher(510, new primitiveHashFunc) // simple test just to show that it really is hashing h.insertIfAbsent("this") h.insertIfAbsent("is") h.insertIfAbsent("a") h.insertIfAbsent("test") h.insertIfAbsent("of") h.insertIfAbsent("this") h.insertIfAbsent("and") h.insertIfAbsent("this") h.insertIfAbsent("is") h.insertIfAbsent("not") msgbox h.displayDump // expect to see each distinct word once
A map is a collection of name-value pairs, a way of storing a value keyed by an arbitrary name (as opposed to an array, which stores a value keyed by an index number). Given a name, you can add it and a corresponding value to the collection, or, if the name is already defined in the collection, you can retrieve or modify its value. REALbasic provides a class which behaves this way (the Collection class, discussed in Chapter 5), but it’s inefficient, and besides, it’s good exercise to create one for ourselves.
Let’s pose a problem like that of the previous section: we wish to store every individual word of a text, without storing any duplicates; but this time we will attach to each word a count of how many times it occurs in the text. We are thus pairing a name (the word) with an integer (the count). So we create a class, Pair, to express this pairing, with two properties, Key and Value. Key, obviously, is a string. You might think Value would be an integer, but for reasons that will be clear later on, we choose to make it an IntPtr, which is a class consisting of a single integer property named V (for “value”). Pair will have a constructor:
Sub pair(s as string) self.key = s self.value = new intPtr End Sub
This initializes the Key to the string value passed in, and the Value to an IntPtr whose V will be autoinitialized to zero without any help from us.
The pairs will be maintained as an array. Since arrays aren’t classes, we will use a CArray, the array wrapper class developed earlier. Unfortunately, CArray is a string array wrapper; now we need it to be a Pair array wrapper. This means we’ll have to rewrite CArray, running through every method and property declaration and all the code, making the appropriate changes. (We will also have to disable the Sort method, because we presently have no way to sort an array of objects.) The full implementation is left as an exercise for the reader, but as an example, here’s CArray’s revised Set method:
Sub set(index as integer, s as pair) self.theArray(index) = s End Sub
Are we really going to have to go through this maddening exercise with CArray every time we change the datatype of the array? No, thank heavens; in Chapter 5 we’ll meet the variant datatype, and with its help we will rewrite CArray one last time, so that it wraps an array of anything without further rewriting. The sorting problem is easily solved with the help of Thomas Tempelmann’s Sort class, mentioned earlier in this chapter.
We next face the problem of how the CArray will be maintained for efficient access. Given a key, we need to know quickly whether the CArray contains a Pair with that key. Clearly the worst solution is to keep the pairs in any old order; they should be kept sorted. This sounds like a job for binary search, the algorithm for rapid access to a sorted array described under Section 2.6.
For the sake of clarity and flexibility, we’ll package up the binary search algorithm as a class, which we’ll call BinarySearch. Give it one property, a CArray which we’ll just call A. Things are a little trickier than they were back in Chapter 2 because a search now needs to hand back two pieces of information: First, was the desired key found? If so, at what index? If not, at what index should it be inserted so as to maintain the array in sorted order? So we’d like BinarySearch to have a Find method that will return two values, Index (an integer) and Found (a boolean). But a subroutine can’t return two values, so we’ll package up this returned value as a class called IndexAndFound, consisting of two properties, Index (an integer) and Found (a boolean), and with a constructor that sets them:
Sub indexAndFound(index as integer, found as boolean) self.index = index self.found = found End Sub
The BinarySearch class is shown as Example 4-4. The constructor points A at the correct CArray. FindBinaryRecursive is identically the algorithm from Chapter 2, except that we are now looking at strings, not integers, and we must change slightly the way we speak in order to access an element of the array because it is now a CArray of Pairs. The public interface to BinarySearch is the Find method: it wraps the call to FindBinaryRecursive in a test for boundary conditions and in an adjustment of the Index in case we don’t find the desired key (in that case, remember, the Index must say where to insert a new Pair), and hands back the result as an IndexAndFound.
Example 4-4. BinarySearch class
// binarySearch.binarySearch: Sub binarySearch(a as cArray) self.a = a End Sub // binarySearch.find: Function find(whatStr as string) As indexAndFound dim index as integer dim found as boolean dim foundStr as string if a.ubound < 0 or whatStr < a.get(0).key then return new indexAndFound(0, false) end index = findBinaryRecursive(0, a.ubound, whatStr) foundStr = a.get(index).key found = (foundStr = whatStr) if not found then if whatStr > foundStr then index = index + 1 end end return new indexAndFound(index, found) End Function // binarySearch.findBinaryRecursive: Function findBinaryRecursive(low as integer, hi as integer, whatStr as string) As integer dim i as integer dim s as string i = (low + hi) / 2 s = a.get(i).key if s = whatStr then return i end if hi <= low then return hi end if s > whatStr then return findBinaryRecursive(low, i-1, whatStr) else return findBinaryRecursive(i+1, hi, whatStr) end End Function
We come at last to the Map class itself. It has two properties. One is Pairs, the CArray in which the name-value pairs will be kept. The other is B, the BinarySearch instance that will be used to access the binary search algorithm. Both properties exemplify typical uses of the "Has-A” relationship between classes. Map has a CArray because that’s its data. Map has a BinarySearch because the binary search algorithm is Map’s servant and no one else’s, and because this particular BinarySearch instance is to be pointed only at this particular CArray.
The Map class is shown as Example 4-5. A constructor initializes the Pairs CArray, and creates the BinarySearch instance, pointing the latter’s CArray pointer at the former; the two properties are thus correctly hooked up at the outset. The Get and Size methods are very simple and are thrown in mostly for debugging purposes. The Refer method is the workhorse of the class; given a key, it either returns its corresponding Value, or else (if no Pair with that key exists) it creates a new Pair with that key at the appropriate index and returns its corresponding Value.
You should now see why a Pair’s Value is an IntPtr, not a mere
integer. It isn’t Map’s job to understand how it is being
used; it’s the job of whoever creates and manipulates the Map
instance. Therefore, we don’t wish merely to tell the caller of
Refer what the value is; we wish to give the caller complete access
to that value, so that the caller can modify it as desired. Ideally
we’d like to return an integer
ByRef, but a
function can’t return a
ByRef result; so we return an instance of a class
that wraps the integer and behaves as a pointer to it. This is a
REALbasic example of what C++ would call “returning a
Example 4-5. Map class
// map.map: Sub map() pairs = new cArray b = new binarySearch(pairs) End Sub // map.refer: Function refer(s as string) As intptr dim result as indexAndFound result = b.find(s) if not result.found then if result.index > pairs.ubound then pairs.append new pair(s) else pairs.insert result.index, new pair(s) end end return pairs.get(result.index).value End Function // map.get: Function get(i as integer) As intptr return pairs.get(i).value End Function // map.size: Function size() As integer return pairs.ubound End Function
To test Map for the purposes of our original problem, we make a utility subroutine Increment which accepts an IntPtr and increments the integer it points to:
Sub increment(ip as intptr) ip.v = ip.v + 1 End Sub
Here’s a test routine:
dim m as map dim i,u as integer m = new map increment(m.refer("this")) increment(m.refer("is")) increment(m.refer("a")) increment(m.refer("test")) increment(m.refer("of")) increment(m.refer("this")) increment(m.refer("and")) increment(m.refer("this")) increment(m.refer("is")) increment(m.refer("not")) u = m.size for i = 0 to u msgbox m.get(i).key + ":" + str(m.get(i).value.v) next
Sure enough, we are counting the word occurrences; we get
a:1, and:1, is:2, and so forth, as expected. Aside
from exemplifying many powerful REALbasic class techniques, the Map
class is of great practical utility; I really did write it originally
to handle the task of counting the occurrences of distinct words in a
document, and on my machine it can accumulate the data for a
4500-word document, containing about 1400 distinct words, in slightly
over a second.