Chapter 1. Fundamentals of UML
On the surface, the Unified Modeling Language (UML) is a visual language for capturing software designs and patterns. Dig a little deeper, though, and you’ll find that UML can be applied to quite a few different areas and can capture and communicate everything from company organization to business processes to distributed enterprise software. It is intended to be a common way of capturing and expressing relationships, behaviors, and high-level ideas in a notation that’s easy to learn and efficient to write. UML is visual; just about everything in it has a graphical representation. Throughout this book we’ll discuss the meaning behind the various UML elements as well as their representations.
If you’re new to UML, you should be sure to read this chapter all the way through to get acquainted with the basic terminology used throughout the book. If you are a developer, class diagrams tend to be the simplest diagrams to start with because they map closely to code. Pick a program or domain you know well, and try to capture the entities involved using classes. Once you’re convinced you’ve modeled the relationships between your entities correctly, pick a piece of functionality and try to model that using a sequence diagram and your classes.
If you’re more of a process person (business or otherwise), you may be more comfortable starting with an activity diagram. Chapter 9 shows examples of modeling business processes with different groups (Human Resources, IT, etc.) and progresses to modeling parallel processes over different geographic regions.
UML has become the de facto standard for modeling software applications and is growing in popularity in modeling other domains. Its roots go back to three distinct methods: the Booch Method by Grady Booch, the Object Modeling Technique coauthored by James Rumbaugh, and Objectory by Ivar Jacobson. Known as the Three Amigos, Booch, Rumbaugh, and Jacobson kicked off what became the first version of UML, in 1994. In 1997, UML was accepted by the Object Management Group (OMG) and released as UML v1.1.
Since then, UML has gone through several revisions and refinements leading up to the current 2.0 release. Each revision has tried to address problems and shortcomings identified in the previous versions, leading to an interesting expansion and contraction of the language. UML 2.0 is by far the largest UML specification in terms of page count (the superstructure alone is over 600 pages), but it represents the cleanest, most compact version of UML yet.
First and foremost, it is important to understand that UML is a language. This means it has both syntax and semantics. When you model a concept in UML, there are rules regarding how the elements can be put together and what it means when they are organized in a certain way. UML is intended not only to be a pictorial representation of a concept, but also to tell you something about its context. How does widget 1 relate to widget 2? When a customer orders something from you, how should the transaction be handled? How does the system support fault tolerance and security?
UML can be used for tasks such as:
- Communicating software or business processes
- Capturing details about a system for requirements or analysis
- Documenting an existing system, process, or organization
UML has been applied to countless domains, including:
- Banking and investment sectors
- Retail sales and supply
The basic building block of UML is a diagram. There are several types, some with very specific purposes (timing diagrams) and some with more generic uses (class diagrams). The following sections touch on some of the major ways UML has been employed. The diagrams mentioned in each section are by no means confined to that section. If a particular diagram helps you convey your message, you should use it; this is one of the basic tenets of UML modeling.
Because UML grew out of the software development domain, it’s not surprising that’s where it still finds its greatest use. When applied to software, UML attempts to bridge the gap between the original idea for a piece of software and its implementation. UML provides a way to capture and discuss requirements at the requirements level (use case diagrams), sometimes a novel concept for developers. There are diagrams to capture what parts of the software realize certain requirements (collaboration diagrams). There are diagrams to capture exactly how those parts of the system realize their requirements (sequence and statechart diagrams). Finally there are diagrams to show how everything fits together and executes (component and deployment diagrams).
Books describing previous versions of UML made a point to emphasize that UML was not a visual programming language; you couldn’t execute your model. However, UML 2.0 changes the rules somewhat. One of the major motivations for the move from UML 1.5 to UML 2.0 was to add the ability for modelers to capture more system behavior and increase tool automation. A relatively new technique called Model Driven Architecture (MDA) offers the potential to develop executable models that tools can link together and to raise the level of abstraction above traditional programming languages. UML 2.0 is central to the MDA effort.
It is important to realize that UML is not a software process. It is meant to be used within a software process and has facets clearly intended to be part of an iterative development approach.
While UML was designed to accommodate automated design tools, it wasn’t intended only for tools. Professional whiteboarders were kept in mind when UML was designed, so the language lends itself to quick sketches and capturing “back of the napkin” type designs.
Business Process Modeling
UML has an extensive vocabulary for capturing behavior and process flow. Activity diagrams and statecharts can be used to capture business processes involving individuals, internal groups, or even entire organizations. UML 2.0 has notation that helps model geographic boundaries (activity partitions), worker responsibilities (swim lanes), and complex transactions (statechart diagrams).
Physically, UML is a set of specifications from the OMG. UML 2.0 is distributed as four specifications: the Diagram Interchange Specification, the UML Infrastructure, the UML Superstructure, and the Object Constraint Language (OCL). All of these specifications are available from the OMG web site, http://www.omg.org.
The Diagram Interchange Specification was written to provide a way to share UML models between different modeling tools. Previous versions of UML defined an XML schema for capturing what elements were used in a UML diagram, but did not capture any information about how a diagram was laid out. To address this, the Diagram Interchange Specification was developed along with a mapping from a new XML schema to a Scalable Vector Graphics (SVG) representation. Typically the Diagram Interchange Specification is used only by tool vendors, though the OMG makes an effort to include “whiteboard tools.”
The UML Infrastructure defines the fundamental, low-level, core, bottom-most concepts in UML; the infrastructure is a metamodel that is used to produce the rest of UML. The infrastructure isn’t typically used by an end user, but it provides the foundation for the UML Superstructure.
The UML Superstructure is the formal definition of the elements of UML, and it weighs in at over 600 pages. This is the authority on all that is UML, at least as far as the OMG is concerned. The superstructure documentation is typically used by tool vendors and those writing books on UML, though some effort has been made to make it human readable.
The OCL specification defines a simple language for writing constraints and expressions for elements in a model. The OCL is often brought into play when you specify UML for a particular domain and need to restrict the allowable values for a parameter or object. Appendix B is an overview of the OCL.
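As a rough illustration of the kind of restriction OCL expresses, the sketch below pairs a hypothetical OCL invariant (shown in a comment) with a Python check that enforces the same rule. The `Account` class, its `balance` attribute, and the invariant itself are assumptions made for this example; they do not come from the UML or OCL specifications.

```python
from dataclasses import dataclass


# Hypothetical OCL invariant that this sketch mirrors:
#   context Account inv: self.balance >= 0
@dataclass
class Account:
    balance: int = 0

    def __post_init__(self) -> None:
        # Enforce the invariant when the object is created.
        if self.balance < 0:
            raise ValueError("invariant violated: balance must be >= 0")

    def withdraw(self, amount: int) -> None:
        # Check the invariant before committing the new state.
        new_balance = self.balance - amount
        if new_balance < 0:
            raise ValueError("invariant violated: balance must be >= 0")
        self.balance = new_balance
```

In a model, the invariant would usually be written near the element or in an attached note; the code only shows what honoring such a constraint might look like in an implementation.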
It is important to realize that while the specification is the definitive source of the formal definition of UML, it is by no means the be-all and end-all of UML. UML is designed to be extended and interpreted depending on the domain, user, and specific application. There is enough wiggle room in the specification to fit a data center through it... this is intentional. For example, there are typically two or more ways to represent a UML concept depending on what looks best in your diagram or what part of a concept you wish to emphasize. You may choose to represent a particular element using an in-house notation; this is perfectly acceptable as far as UML is concerned. However, you must be careful when using nonstandard notation because part of the reason for using UML in the first place is to have a common representation when collaborating with other users.
Putting UML to Work
A UML model provides a view of a system—often just one of many views needed to actually build or document the complete system. Users new to UML can fall into the trap of trying to model everything about their system with a single diagram and end up missing critical information. Or, at the other extreme, they may try to incorporate every possible UML diagram into their model, thereby overcomplicating things and creating a maintenance nightmare.
Becoming proficient with UML means understanding what each diagram has to offer and knowing when to apply it. There will be many times when a concept could be expressed using any number of diagrams; pick the one(s) that will mean the most to your users.
Each chapter of this book describes a type of diagram and gives examples of its use. There are times when you may need to have more than one diagram to capture all the relevant details for a single part of your system. For example, you may need a statechart diagram to show how an embedded controller processes input from a user as well as a timing diagram to show how the controller interacts with the rest of the system as a result of that input.
You should also consider your audience when creating models. A test engineer may not care about the low-level implementation (sequence diagram) of a component, only the external interfaces it offers (component diagram). Be sure to consider who will be using each diagram you produce and make it meaningful to that person.
In addition to a variety of diagram types, UML is designed to be extended. You can informally extend UML by adding constraints, stereotypes, tagged values, and notes to your models, or you can use the formal UML extension and define a full UML profile. A UML profile is a collection of stereotypes and constraints on elements that map the otherwise generic UML to a specific problem domain or implementation. For example, there are profiles for CORBA, Enterprise Application Integration (EAI), fault tolerance, database modeling, and testing. See Chapter 11 for more information on UML 2.0 Profiles.
It should go without saying that the focus of UML is modeling. However, what that means, exactly, can be an open-ended question. Modeling is a means to capture ideas, relationships, decisions, and requirements in a well-defined notation that can be applied to many different domains. Modeling not only means different things to different people, but it can also use different pieces of UML depending on what you are trying to convey.
In general a UML model is made up of one or more diagrams. A diagram graphically represents things, and the relationships between these things. These things can be representations of real-world objects, pure software constructs, or a description of the behavior of some other object. It is common for an individual thing to show up on multiple diagrams; each diagram represents a particular interest, or view, of the thing being modeled.
UML 2.0 divides diagrams into two categories: structural diagrams and behavioral diagrams. Structural diagrams are used to capture the physical organization of the things in your system—i.e., how one object relates to another. There are several structural diagrams in UML 2.0:
- Class diagrams
Class diagrams use classes and interfaces to capture details about the entities that make up your system and the static relationships between them. Class diagrams are one of the most commonly used UML diagrams, and they vary in detail from fully fleshed-out and able to generate source code to quick sketches on whiteboards and napkins. Class diagrams are discussed in Chapter 2.
- Component diagrams
Component diagrams show the organization and dependencies involved in the implementation of a system. They can group smaller elements, such as classes, into larger, deployable pieces. How much detail you use in component diagrams varies depending on what you are trying to show. Some people simply show the final, deployable version of a system, and others show what functionality is provided by a particular component and how it realizes its functionality internally. Component diagrams are discussed in Chapter 5.
- Composite structure diagrams
Composite structure diagrams are new to UML 2.0. As systems become more complex, the relationships between elements grow in complexity as well. Conceptually, composite structure diagrams link class diagrams and component diagrams; they don’t emphasize the design detail that class diagrams do or the implementation detail that component diagrams do. Instead, composite structures show how elements in the system combine to realize complex patterns. Composite structures are discussed in Chapter 4.
- Deployment diagrams
Deployment diagrams show how your system is actually executed and assigned to various pieces of hardware. You typically use deployment diagrams to show how components are configured at runtime. Deployment diagrams are discussed in Chapter 6.
- Package diagrams
Package diagrams are really special types of class diagrams. They use the same notation but their focus is on how classes and interfaces are grouped together. Package diagrams are discussed in Chapter 3.
- Object diagrams
Object diagrams use the same syntax as class diagrams and show how actual instances of classes are related at a specific instant in time. You use object diagrams to show snapshots of the relationships in your system at runtime. Object diagrams are discussed as part of class diagrams in Chapter 2.
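To make the link between structural diagrams and code more concrete, here is a minimal sketch. The class definitions play the role a class diagram would (entities and their static relationships), while the instances created at the bottom correspond to what an object diagram would show as a snapshot at one moment. The names `Customer` and `Order` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Order:
    order_id: int
    total: float


@dataclass
class Customer:
    name: str
    # One-to-many association: a Customer is linked to zero or more Orders.
    orders: List[Order] = field(default_factory=list)

    def place(self, order: Order) -> None:
        self.orders.append(order)


# Object-diagram view: specific instances and their links at one moment.
alice = Customer("Alice")
alice.place(Order(1, 19.99))
alice.place(Order(2, 5.00))
```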
Behavioral diagrams focus on the behavior of elements in a system. For example, you can use behavioral diagrams to capture requirements, operations, and internal state changes for elements. The behavioral diagrams are:
- Activity diagrams
Activity diagrams capture the flow from one behavior or activity to the next. They are similar in concept to a classic flowchart, but are much more expressive. Activity diagrams are discussed in Chapter 9.
- Communication diagrams
Communication diagrams are a type of interaction diagram that focuses on the elements involved in a particular behavior and what messages they pass back and forth. Communication diagrams emphasize the objects involved more than the order and nature of the messages exchanged. Communication diagrams are discussed as part of interaction diagrams in Chapter 10.
- Interaction overview diagrams
Interaction overview diagrams are simplified versions of activity diagrams. Instead of emphasizing the activity at each step, interaction overview diagrams emphasize which element or elements are involved in performing that activity. The UML specification describes interaction diagrams as emphasizing who has the focus of control throughout the execution of a system. Interaction overview diagrams are discussed as part of interaction diagrams in Chapter 10.
- Sequence diagrams
Sequence diagrams are a type of interaction diagram that emphasizes the type and order of messages passed between elements during execution. Sequence diagrams are the most common type of interaction diagram and are very intuitive to new users of UML. Sequence diagrams are discussed as part of interaction diagrams in Chapter 10.
- State machine diagrams
State machine diagrams capture the internal state transitions of an element. The element could be as small as a single class or as large as the entire system. State machine diagrams are commonly used to model embedded systems and protocol specifications or implementations. State machine diagrams are discussed in Chapter 8.
- Timing diagrams
Timing diagrams are a type of interaction diagram that emphasizes detailed timing specifications for messages. They are often used to model real-time systems such as satellite communication or hardware handshaking. They have specific notation to indicate how long a system has to process or respond to messages, and how external interruptions are factored into execution. Timing diagrams are discussed as part of interaction diagrams in Chapter 10.
- Use case diagrams
Use case diagrams capture functional requirements for a system. They provide an implementation-independent view of what a system is supposed to do and allow the modeler to focus on user needs rather than realization details. Use case diagrams are discussed in Chapter 7.
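Of the behavioral diagrams listed above, state machine diagrams translate most directly into code: the internal state transitions they capture can be sketched as a table. The example below is a minimal illustration, assuming a hypothetical door controller with two states and two events. It leaves out UML features such as guards, entry and exit actions, and nested states.

```python
from typing import Dict, Tuple

# (current state, event) -> next state; each entry is one transition arrow.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("closed", "open"): "opened",
    ("opened", "close"): "closed",
}


class Door:
    def __init__(self) -> None:
        self.state = "closed"  # initial state

    def handle(self, event: str) -> str:
        # Events with no matching transition leave the state unchanged.
        self.state = TRANSITIONS.get((self.state, event), self.state)
        return self.state
```

Keeping the transitions in a single table mirrors the diagram closely, which can make it easier to compare the code against the model during review.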
While not strictly part of UML itself, the concept of views of a system helps the modeler choose diagrams that help convey the correct information depending on his goals. Specifically, models are often divided into what is called the 4+1 views of a system. The 4+1 notation represents four distinct views of a system and one overview of how everything fits together. The four views are:
- Design view
The design view captures the classes, interfaces, and patterns that describe the representation of the problem domain and how the software will be built to address it. The design view almost always uses class diagrams, object diagrams, activity diagrams, composite structure diagrams, and sequence diagrams to convey the design of a system. The design view typically doesn’t address how the system will be implemented or executed.
- Deployment view
The deployment view captures how a system is configured, installed, and executed. It often consists of component diagrams, deployment diagrams, and interaction diagrams. The deployment view captures how the physical layout of the hardware communicates to execute the system, and can be used to show failover, redundancy, and network topology.
- Implementation view
The implementation view emphasizes the components, files, and resources used by a system. Typically the implementation view focuses on the configuration management of a system: what components depend on what, what source files implement what classes, etc. Implementation views almost always use one or more component diagrams and may include interaction diagrams, statechart diagrams, and composite structure diagrams.
- Process view
The process view of a system is intended to capture concurrency, performance, and scalability information. Process views often use some form of interaction diagrams and activity diagrams to show how a system actually behaves at runtime.
The four distinct views of a system are brought together with the final view:
- Use case view
The use case view captures the functionality required by the end users. The concept of end users is deliberately broad in the use case view; they include the primary stakeholders, the system administrator, the testers, and potentially the developers themselves. The use case view is often broken down into collaborations that link a use case with one or more of the four basic views. The use case view includes use case diagrams and typically uses several interaction diagrams to show use case details.
UML provides a catchall element, or note, for adding information to your diagram. The note symbol is a dog-eared rectangle with an optional dashed line to link it to some element. Figure 1-1 shows a simple note.
In general, you can use notes to capture just about anything in your diagram. Notes are often used to express additional information that either doesn’t have its own notation or would clutter a diagram if you drew it right on the element. Some tools allow you to embed URL links in notes, providing an easy way to navigate from one diagram to the next, or to HTML documents, etc.
Classifiers and Adornments
The basic modeling element in UML is the classifier. A classifier represents a group of things with common properties. Remember, at the level of classifier, we are discussing UML itself, not a particular system. So, when we say a class is a classifier, we mean that classes are things that have common properties: methods, attributes, exceptions, visibility, etc. A specific class, such as Automobile, isn’t a UML classifier; it’s an instance of a classifier, or a class.
For the truly self-abusing, this is a glimpse into the UML meta-model. The full metamodel is quite complex and begins with the UML infrastructure specification.
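Python's own type system offers a loose analogy for the distinction between a classifier and a specific class, sketched below. It is an analogy only: Python's `type` is not the UML metamodel, and `Automobile` here is just an empty class defined for the illustration.

```python
class Automobile:
    """A specific class, analogous to a class drawn in a model."""


# Automobile is itself an instance of the language-level concept "class"
# (type in Python), much as a modeled class is an instance of a classifier.
automobile_is_a_class = isinstance(Automobile, type)

# A particular car is an instance of Automobile, one level further down.
my_car = Automobile()
car_is_an_automobile = isinstance(my_car, Automobile)
car_is_a_class = isinstance(my_car, type)
```

Here `automobile_is_a_class` and `car_is_an_automobile` are true while `car_is_a_class` is false, which reflects the three levels involved: the concept of a class, a particular class, and a particular object.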
A classifier’s generic notation is a rectangle that can be divided into compartments to show classifier-specific information, such as operations, attributes, or state activities. However, many UML classifiers such as states, activities, objects, etc., have custom notations to help distinguish them visually.
A classifier can have several types of extra information attached to it via a UML mechanism called adornments. For example, classifiers can have constraints: restrictions placed on the values a feature of the classifier can take. In general, constraints are written near the classifier or in an attached note. See the specific diagram types for details on what notation to use for a constraint when writing it near the classifier.
Another type of adornment is a stereotype. Just as you would expect, a stereotype is intended to give the reader a general idea of what a particular classifier represents. Stereotypes are usually associated with implementation concepts, such as «singleton», though that isn’t required by the UML specification.
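In code, a stereotype behaves much like a label attached to a class. The sketch below shows one hypothetical way to mimic that with a Python decorator. The `stereotype` helper, the `__stereotypes__` attribute, and the `render_label` function are all invented for this example and do not correspond to any UML tool or library API.

```python
from typing import Callable, Tuple, Type


def stereotype(*names: str) -> Callable[[Type], Type]:
    """Attach stereotype labels to a class without changing its behavior."""
    def decorate(cls: Type) -> Type:
        existing: Tuple[str, ...] = getattr(cls, "__stereotypes__", ())
        cls.__stereotypes__ = existing + names
        return cls
    return decorate


@stereotype("singleton")
class Configuration:
    pass


def render_label(cls: Type) -> str:
    # Format the labels the way they appear on a diagram: «name».
    marks = "".join(f"«{s}»" for s in getattr(cls, "__stereotypes__", ()))
    return f"{marks} {cls.__name__}".strip()
```

Note that the label only documents intent. Nothing in this sketch makes `Configuration` behave as a singleton, just as a stereotype on a diagram describes an element without implementing it.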
UML Rules of Thumb
While UML provides a common language for capturing functionality and design information, it is deliberately open-ended to allow for the flexibility needed to model different domains. There are several rules of thumb to keep in mind when using UML:
- Nearly everything in UML is optional
UML provides a language to capture information that varies greatly depending on the domain of the problem. In doing that, there are often parts of UML that either don’t apply to your particular problem or may not lend anything to the particular view you are trying to convey. It is important to realize that you don’t need to use every part of UML in every model you create. Possibly even more importantly, you don’t need to use every allowable symbol for a diagram type in every diagram you create. Show only what helps clarify the message you are trying to convey, and leave off what you don’t need. At times there is more than one way to convey the same information; use what is familiar to your audience.
- UML models are rarely complete
As a consequence of everything being optional, it is common for a UML model to be missing some details about a system. The trick is to not miss key details that could impact your system design. Knowing what is a key detail versus extraneous information comes with experience; however, using an iterative process and revisiting your model helps to flesh out what needs to be there. As UML moves closer to tool automation with practices like MDA and Software Factories, the models often become more and more detailed and therefore complete. The difference is the tool support that helps vary the level of abstraction depending on your needs.
- UML is designed to be open to interpretation
While the UML specification does a good job of laying down the groundwork for a modeling language, it is critical that within an organization or group of users you establish how and when to use a language feature. For example, some organizations use an aggregation relationship to indicate a C++ pointer and a composition relationship to indicate a C++ reference. There is nothing inherently wrong with this distinction, but it’s something that isn’t going to be immediately obvious to someone not familiar with that organization’s modeling technique. It is a good practice to put together a document on modeling guidelines; it helps novice users get up to speed quicker and helps experienced users really think about how they represent something and consider a potentially better notation.
- UML is intended to be extended
UML includes several mechanisms to allow customization and refinement of the language. Such mechanisms as adornments, constraints, and stereotypes provide ways to capture specific details that aren’t easily expressed using classifiers and relationships. Typically these are grouped into what are known as UML profiles. For example, you can put together a Java 2 Enterprise Edition (J2EE) profile that includes stereotypes for javadataobject. If you are modeling a complex domain, consider putting together a UML profile that lets you easily identify elements as concepts in your domain, such as