Mastering Concurrency Programming with Java 9 - Second Edition

Book Description

Master the principles to make applications robust, scalable and responsive

About This Book

  • Implement concurrent applications using the Java 9 Concurrency API and its new components
  • Improve the performance of your applications and process more data at the same time, taking advantage of all of your resources
  • Construct real-world examples related to machine learning, data mining, natural language processing, and more

Who This Book Is For

This book is for competent Java developers who have a basic understanding of concurrency. Prior knowledge of how to implement concurrent programs effectively, or of how to use streams to make processes more efficient, is not required.

What You Will Learn

  • Master the principles that every concurrent application must follow
  • See how to parallelize a sequential algorithm to obtain better performance without data inconsistencies and deadlocks
  • Get the most from the Java Concurrency API components
  • Separate the thread management from the rest of the application with the Executor component
  • Execute phase-based tasks efficiently with the Phaser components
  • Solve problems using a parallelized version of the divide and conquer paradigm with the Fork / Join framework
  • Find out how to use parallel Streams and Reactive Streams
  • Implement the “map and reduce” and “map and collect” programming models
  • Control the concurrent data structures and synchronization mechanisms provided by the Java Concurrency API
  • Implement efficient solutions for real-world problems such as data mining, machine learning, and more

In Detail

Concurrency programming lets you divide large tasks into smaller sub-tasks, which are then processed as individual tasks that run in parallel. Java 9 includes a comprehensive API with many ready-to-use components for implementing powerful concurrent applications, and these components are flexible enough to adapt to your needs.
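
As a rough illustration of the idea of splitting one large task into sub-tasks that run in parallel, the sketch below sums an array in chunks using an executor from java.util.concurrent. It is not taken from the book: the class name ChunkedSum and the methods sumRange and parallelSum are hypothetical names chosen for this example, and the chunking strategy is only one of several reasonable choices.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class, not part of the book's code bundle.
public class ChunkedSum {

    // Sums data[from, to) sequentially; this is the work done by one sub-task.
    static long sumRange(int[] data, int from, int to) {
        long total = 0;
        for (int i = from; i < to; i++) {
            total += data[i];
        }
        return total;
    }

    // Splits the array into at most `parts` chunks (parts must be > 0),
    // submits one task per chunk, and combines the partial sums.
    static long parallelSum(int[] data, int parts)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(parts);
        try {
            List<Future<Long>> partials = new ArrayList<>();
            int chunk = (data.length + parts - 1) / parts; // ceiling division
            for (int from = 0; from < data.length; from += chunk) {
                final int start = from;
                final int end = Math.min(from + chunk, data.length);
                Callable<Long> task = () -> sumRange(data, start, end);
                partials.add(executor.submit(task));
            }
            long total = 0;
            for (Future<Long> partial : partials) {
                total += partial.get(); // blocks until that sub-task finishes
            }
            return total;
        } finally {
            executor.shutdown(); // always release the worker threads
        }
    }

    public static void main(String[] args) throws Exception {
        int[] data = new int[1_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(parallelSum(data, 4));
    }
}
```

Each sub-task reads only its own slice of the array and returns its result through a Future, so no shared mutable state or explicit locking is needed in this sketch.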

The book starts with a full description of the design principles of concurrent applications and explains how to parallelize a sequential algorithm. You will then be introduced to Threads and Runnables, which are an integral part of Java 9's concurrency API. You will see how to use all the components of the Java concurrency API, from the basics to the most advanced techniques, and will implement them in powerful real-world concurrency applications.

The book ends with a detailed description of the tools and techniques you can use to test a concurrent Java application, along with a brief look at other concurrency mechanisms in the JVM.

Style and approach

This is a complete guide that implements real-world examples of algorithms related to machine learning, data mining, and natural language processing in client/server environments. All the examples are explained using a step-by-step approach.

Downloading the example code for this book: you can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to receive the code files.

Table of Contents

  1. Preface
    1. What this book covers
    2. What you need for this book
    3. Who this book is for
    4. Conventions
    5. Reader feedback
    6. Customer support
      1. Downloading the example code
      2. Errata
      3. Piracy
      4. Questions
  2. The First Step - Concurrency Design Principles
    1. Basic concurrency concepts
      1. Concurrency versus parallelism
      2. Synchronization
      3. Immutable object
      4. Atomic operations and variables
      5. Shared memory versus message passing
    2. Possible problems in concurrent applications
      1. Data race
      2. Deadlock
      3. Livelock
      4. Resource starvation
      5. Priority inversion
    3. A methodology to design concurrent algorithms
      1. The starting point - a sequential version of the algorithm
      2. Step 1 - analysis
      3. Step 2 - design
      4. Step 3 - implementation
      5. Step 4 - testing
      6. Step 5 - tuning
      7. Conclusion
    4. Java Concurrency API
      1. Basic concurrency classes
      2. Synchronization mechanisms
      3. Executors
      4. The fork/join framework
      5. Parallel streams
      6. Concurrent data structures
    5. Concurrency design patterns
      1. Signaling
      2. Rendezvous
      3. Mutex
      4. Multiplex
      5. Barrier
      6. Double-checked locking
      7. Read-write lock
      8. Thread pool
      9. Thread local storage
    6. Tips and tricks for designing concurrent algorithms
      1. Identifying the correct independent tasks
      2. Implementing concurrency at the highest possible level
      3. Taking scalability into account
      4. Using thread-safe APIs
      5. Never assume an execution order
      6. Preferring local thread variables over static and shared when possible
      7. Finding the easier parallelizable version of the algorithm
      8. Using immutable objects when possible
      9. Avoiding deadlocks by ordering the locks
      10. Using atomic variables instead of synchronization
      11. Holding locks for as short a time as possible
      12. Taking precautions using lazy initialization
      13. Avoiding the use of blocking operations inside a critical section
    7. Summary
  3. Working with Basic Elements - Threads and Runnables
    1. Threads in Java
      1. Threads in Java - characteristics and states
      2. The Thread class and the Runnable interface
    2. First example - matrix multiplication
      1. Common classes
      2. Serial version
      3. Parallel versions
        1. First concurrent version - a thread per element
        2. Second concurrent version - a thread per row
        3. Third concurrent version - the number of threads is determined by the processors
      4. Comparing the solutions
    3. Second example - file search
      1. Common classes
      2. Serial version
      3. Concurrent version
      4. Comparing the solutions
    4. Summary
  4. Managing Lots of Threads - Executors
    1. An introduction to executors
      1. Basic characteristics of executors
      2. Basic components of the Executor framework
    2. First example - the k-nearest neighbors algorithm
      1. k-nearest neighbors - serial version
      2. k-nearest neighbors - a fine-grained concurrent version
      3. k-nearest neighbors - a coarse-grained concurrent version
      4. Comparing the solutions
    3. Second example - concurrency in a client/server environment
      1. Client/server - serial version
        1. The DAO part
        2. The command part
        3. The server part
      2. Client/server - parallel version
        1. The server part
        2. The command part
      3. Extra components of the concurrent server
        1. The status command
        2. The cache system
        3. The log system
        4. Comparing the two solutions
      4. Other methods of interest
    4. Summary
  5. Getting the Most from Executors
    1. Advanced characteristics of executors
      1. Cancellation of tasks
      2. Scheduling the execution of tasks
      3. Overriding the executor methods
      4. Changing some initialization parameters
    2. First example - an advanced server application
      1. The ServerExecutor class
        1. The statistics object
        2. The rejected task controller
        3. The executor tasks
        4. The executor
      2. The command classes
        1. The ConcurrentCommand class
        2. The concrete commands
      3. The server part
        1. The ConcurrentServer class
        2. The RequestTask class
      4. The client part
    3. Second example - executing periodic tasks
      1. The common parts
      2. The basic reader
      3. The advanced reader
    4. Additional information about executors
    5. Summary
  6. Getting Data from Tasks - The Callable and Future Interfaces
    1. Introducing the Callable and Future interfaces
      1. The Callable interface
      2. The Future interface
    2. First example - a best-matching algorithm for words
      1. The common classes
      2. A best-matching algorithm - the serial version
        1. The BestMatchingSerialCalculation class
        2. The BestMatchingSerialMain class
      3. A best-matching algorithm - the first concurrent version
        1. The BestMatchingBasicTask class
        2. The BestMatchingBasicConcurrentCalculation class
      4. A best-matching algorithm - the second concurrent version
      5. Word exists algorithm - a serial version
        1. The ExistSerialCalculation class
        2. The ExistSerialMain class
      6. Word exists algorithm - the concurrent version
        1. The ExistBasicTasks class
        2. The ExistBasicConcurrentCalculation class
        3. The ExistBasicConcurrentMain class
      7. Comparing the solutions
        1. Best-matching algorithms
        2. Exist algorithms
    3. The second example - creating an inverted index for a collection of documents
      1. Common classes
        1. The Document class
        2. The DocumentParser class
      2. The serial version
      3. The first concurrent version - a task per document
        1. The IndexingTask class
        2. The InvertedIndexTask class
        3. The ConcurrentIndexing class
      4. The second concurrent version - multiple documents per task
        1. The MultipleIndexingTask class
        2. The MultipleInvertedIndexTask class
        3. The MultipleConcurrentIndexing class
      5. Comparing the solutions
      6. Other methods of interest
    4. Summary
  7. Running Tasks Divided into Phases - The Phaser Class
    1. An introduction to the Phaser class
      1. Registration and deregistration of participants
      2. Synchronizing phase change
      3. Other functionalities
    2. First example - a keyword extraction algorithm
      1. Common classes
        1. The Word class
        2. The Keyword class
        3. The Document class
        4. The DocumentParser class
      2. The serial version
      3. The concurrent version
        1. The KeywordExtractionTask class
        2. The ConcurrentKeywordExtraction class
      4. Comparing the two solutions
    3. The second example - a genetic algorithm
      1. Common classes
        1. The Individual class
        2. The GeneticOperators class
      2. The serial version
        1. The SerialGeneticAlgorithm class
        2. The SerialMain class
      3. The concurrent version
        1. The SharedData class
        2. The GeneticPhaser class
        3. The ConcurrentGeneticTask class
        4. The ConcurrentGeneticAlgorithm class
        5. The ConcurrentMain class
      4. Comparing the two solutions
        1. Lau15 dataset
        2. Kn57 dataset
        3. Conclusions
    4. Summary
  8. Optimizing Divide and Conquer Solutions - The Fork/Join Framework
    1. An introduction to the fork/join framework
      1. Basic characteristics of the fork/join framework
      2. Limitations of the fork/join framework
      3. Components of the fork/join framework
    2. The first example - the k-means clustering algorithm
      1. The common classes
        1. The VocabularyLoader class
        2. The Word, Document, and DocumentLoader classes
        3. The DistanceMeasurer class
        4. The DocumentCluster class
      2. The serial version
        1. The SerialKMeans class
        2. The SerialMain class
      3. The concurrent version
        1. Two tasks for the fork/join framework - AssignmentTask and UpdateTask
        2. The ConcurrentKMeans class
        3. The ConcurrentMain class
      4. Comparing the solutions
    3. The second example - a data filtering algorithm
      1. Common features
      2. The serial version
        1. The SerialSearch class
        2. The SerialMain class
      3. The concurrent version
        1. The TaskManager class
        2. The IndividualTask class
        3. The ListTask class
        4. The ConcurrentSearch class
        5. The ConcurrentMain class
      4. Comparing the two versions
    4. The third example - the merge sort algorithm
      1. Shared classes
      2. The serial version
        1. The SerialMergeSort class
        2. The SerialMetaData class
      3. The concurrent version
        1. The MergeSortTask class
        2. The ConcurrentMergeSort class
        3. The ConcurrentMetaData class
      4. Comparing the two versions
    5. Other methods of the fork/join framework
    6. Summary
  9. Processing Massive Datasets with Parallel Streams - The Map and Reduce Model
    1. An introduction to streams
      1. Basic characteristics of streams
      2. Sections of a stream
        1. Sources of a stream
        2. Intermediate operations
        3. Terminal operations
      3. MapReduce versus MapCollect
    2. The first example - a numerical summarization application
      1. The concurrent version
        1. The ConcurrentDataLoader class
        2. The ConcurrentStatistics class
          1. Customers from the United Kingdom
          2. Quantity from the United Kingdom
          3. Countries for product
          4. Quantity for product
          5. Multiple data filter
          6. Highest invoice amounts
          7. Products with a unit price between 1 and 10
        3. The ConcurrentMain class
      2. The serial version
      3. Comparing the two versions
    3. The second example - an information retrieval search tool
      1. An introduction to the reduction operation
      2. The first approach - full document query
        1. The basicMapper() method
        2. The Token class
        3. The QueryResult class
      3. The second approach - reduced document query
        1. The limitedMapper() method
      4. The third approach - generating an HTML file with the results
        1. The ContentMapper class
      5. The fourth approach - preloading the inverted index
        1. The ConcurrentFileLoader class
      6. The fifth approach - using our own executor
      7. Getting data from the inverted index - the ConcurrentData class
      8. Getting the number of words in a file
      9. Getting the average tfxidf value in a file
      10. Getting the maximum and minimum tfxidf values in the index
      11. The ConcurrentMain class
      12. The serial version
      13. Comparing the solutions
    4. Summary
  10. Processing Massive Datasets with Parallel Streams - The Map and Collect Model
    1. Using streams to collect data
      1. The collect() method
    2. The first example - searching data without an index
      1. Basic classes
        1. The Product class
        2. The Review class
        3. The ProductLoader class
      2. The first approach - basic search
        1. The ConcurrentStringAccumulator class
      3. The second approach - advanced search
        1. The ConcurrentObjectAccumulator class
      4. A serial implementation of the example
      5. Comparing the implementations
    3. The second example - a recommendation system
      1. Common classes
        1. The ProductReview class
        2. The ProductRecommendation class
      2. Recommendation system - the main class
      3. The ConcurrentLoaderAccumulator class
      4. The serial version
      5. Comparing the two versions
    4. The third example - common contacts in a social network
      1. Base classes
        1. The Person class
        2. The PersonPair class
        3. The DataLoader class
      2. The concurrent version
        1. The CommonPersonMapper class
        2. The ConcurrentSocialNetwork class
        3. The ConcurrentMain class
      3. The serial version
        1. Comparing the two versions
    5. Summary
  11. Asynchronous Stream Processing - Reactive Streams
    1. Introduction to reactive streams in Java
      1. The Flow.Publisher interface
      2. The Flow.Subscriber interface
      3. The Flow.Subscription interface
      4. The SubmissionPublisher class
    2. The first example - a centralized system for event notification
      1. The Event class
      2. The Producer class
      3. The Consumer class
      4. The Main class
    3. The second example - a news system
      1. The News class
      2. The publisher classes
      3. The Consumer class
      4. The Main class
    4. Summary
  12. Diving into Concurrent Data Structures and Synchronization Utilities
    1. Concurrent data structures
      1. Blocking and non-blocking data structures
      2. Concurrent data structures
        1. Interfaces
          1. BlockingQueue
          2. BlockingDeque
          3. ConcurrentMap
          4. TransferQueue
        2. Classes
          1. LinkedBlockingQueue
          2. ConcurrentLinkedQueue
          3. LinkedBlockingDeque
          4. ConcurrentLinkedDeque
          5. ArrayBlockingQueue
          6. DelayQueue
          7. LinkedTransferQueue
          8. PriorityBlockingQueue
          9. ConcurrentHashMap
      3. Using the new features
        1. First example with ConcurrentHashMap
          1. The forEach() method
          2. The search() method
          3. The reduce() method
          4. The compute() method
        2. Another example with ConcurrentHashMap
        3. An example with the ConcurrentLinkedDeque class
          1. The removeIf() method
          2. The spliterator() method
      4. Atomic variables
      5. Variable handles
    2. Synchronization mechanisms
      1. The CommonTask class
      2. The Lock interface
      3. The Semaphore class
      4. The CountDownLatch class
      5. The CyclicBarrier class
      6. The CompletableFuture class
        1. Using the CompletableFuture class
          1. Auxiliary tasks
          2. The main() method
    3. Summary
  13. Testing and Monitoring Concurrent Applications
    1. Monitoring concurrency objects
      1. Monitoring a thread
      2. Monitoring a lock
      3. Monitoring an executor
      4. Monitoring the fork/join framework
      5. Monitoring a Phaser
      6. Monitoring the Stream API
    2. Monitoring concurrency applications
      1. The Overview tab
      2. The Memory tab
      3. The Threads tab
      4. The Classes tab
      5. The VM summary tab
      6. The MBeans tab
      7. The About tab
    3. Testing concurrency applications
      1. Testing concurrent applications with MultithreadedTC
      2. Testing concurrent applications with Java Pathfinder
        1. Installing Java Pathfinder
        2. Running Java Pathfinder
    4. Summary
  14. Concurrency in JVM - Clojure and Groovy with the GPars Library and Scala
    1. Concurrency in Clojure
      1. Using Java elements
      2. Reference types
        1. Atoms
        2. Agents
      3. Refs
      4. Delays
      5. Futures
      6. Promises
    2. Concurrency in Groovy with the GPars library
    3. Software transactional memory
      1. Using Java elements
      2. Data parallelism
      3. The fork/join processing
      4. Actors
      5. Agent
      6. Dataflow
    4. Concurrency in Scala
      1. Future objects in Scala
      2. Promises
    5. Summary