Th Group: parallel processing

All real-time applications, including operating systems, need some form of parallel processing. Each process needs to define scheduling and the system resources it uses or destroys. Processes are normally structured hierarchically but non-hierarchical structures provide increased reliability and flexibility. In real-time systems, the user is also a process interacting asynchronously with the system. (cbb 5/80)

Group members

Topic: action chunks
Topic: asynchronous processing
Topic: communicating sequential processes
Topic: concurrent while in parallel processing
Topic: data parallel processing
Topic: defining a process
Topic: hard real time systems
Topic: high priority processes
Topic: interrupts
Topic: interprocess communication
Topic: lock-free concurrency
Topic: shared memory
Topic: massively parallel processors
Topic: models of parallel computation
Topic: non-preemptive task scheduling
Topic: parallel algorithms
Topic: pipeline processing
Topic: plan-based task scheduling
Topic: process migration
Topic: calculus of communicating processes
Topic: race conditions
Topic: real time systems
Topic: synchronous communication through a channel
Topic: task scheduling
Topic: waitfor condition in parallel processing
Topic: concurrency
Topic: concurrency control by monitors
Topic: concurrent operations
Topic: critical regions
Topic: deadlocks
Topic: events
Topic: event controlled processing
Topic: monitored statements and events
Topic: multi-tasking
Topic: non-deterministic processing
Topic: parallel control statements
Topic: Petri net
Topic: Petri net transitions and events
Topic: process threads
Topic: remote procedure call
Topic: synchronized processing
Topic: vector processing

Subtopic: idealized parallelism

Quote: ideal reactive systems produce outputs simultaneously with inputs; allows decomposition without affecting behavior, timing, or interleaving [benvA9_1991]

Subtopic: concurrent algorithms

Quote: concurrent, incremental memory compressor with short pauses and one heap pass; requires 3x storage [kermH6_2006]
Quote: for concurrent operations on B-trees, separate safe from unsafe nodes; an update to a safe node does not affect its ancestors [bayeR_1977]
Quote: algorithm for concurrent, tunable indexing of B*-trees; deadlock free [bayeR_1977]
Quote: survey of optimistic replication algorithms; background propagation of change; supports concurrent work and allows for unreliable communications [saitY3_2005]

Subtopic: deterministic parallel processing

Quote: deterministic parallel programming by annotating a sequential program; 16x speedup on 24 cores; a task annotation identifies a block for potential, out-of-order execution [jeniJC2_2011]
Quote: parallel programs should always run deterministically; easier to collect crash and bug reports from users [deviJ3_2009]
Quote: replay log as a total order of program chunks; e.g., 1000 instructions; DeLorean's BulkSC execution mode [montP3_2009]

Subtopic: many-core processors

Quote: concurrent garbage collection of hundreds of thousands of actors for a high-throughput, low-latency application; 10 nanoseconds per block and unblock message [clebS10_2013]
Quote: threads on the same core can help each other through the low-level cache (LLC) by prefetching instructions and data [zhurS11_2012]
Quote: multicore processors have much richer interleaving than single core processors; will lead to new system failures [leeEA5_2006]
Quote: allocate a core's page frames from different memory banks; randomize access to memory banks; 40% faster for memory intensive benchmarks [parkH3_2013]
Quote: memory will be the bottleneck for many-core processors [guptD10_2010]

Subtopic: yield point

Quote: a program is cooperable if all thread interferences are annotated with 'yield'; between yield points, sequential reasoning is correct by default [yiJ2_2011]

Subtopic: inter-program communication

Quote: real-time systems by connecting programs via the operating system, using finite-state machine or Petri nets, using concurrent programming languages [benvA9_1991]
Quote: UNIX processes communicate via shared open files, program arguments, exit codes, predefined files, and pipes [mashJR_1976]

Subtopic: process pair

Quote: a process-pair has a long MTBF when a software module has a MTBF of a year; a third process does not improve MTBF due to operator failures, etc. [grayJ11_1985]

Subtopic: map reduce

Quote: River Trail's data-parallel API defines ParallelArray of plain-old types, parallel methods such as map, reduce, scan and filter, and side-effect free, elemental functions that process each array element [herhS10_2013]
Quote: MapReduce automatically parallelizes a distributed computation defined by a map function to key/value pairs and a reduce operation over values that share the same key [deanJ1_2008]
Quote: MapReduce splits input into M large pieces; each worker maps and processes the key/value pairs into R regions; the reduce worker sorts and reduces a region to an output file; size reduced at each step [deanJ1_2008]

Subtopic: scaling

Quote: the scalable commutativity rule -- whenever interface operations commute, they can be implemented in a way that scales; start a new design with the design of scalable software interfaces [clemAT8_2017]
Quote: efficient synchronization methods for large-scale, shared mutable data [morrA11_16]
Quote: Sawzall has near perfect scaling; e.g., processed 450 GB compressed query logs at 0.98 performance/machine [pikeR4_2005]
Quote: linear scaling and simplicity important for Web applications; allows high volumes and throughput [boswA10_2005]

Subtopic: message passing vs. shared data

Quote: procedural message passing is more complicated than data-parallel or shared-variable programs; use an application builder [wilsGV_1995]

Subtopic: process state

Quote: a process is either active or it is suspended, awaiting activation by an external event such as I/O [dennJB3_1966]

Subtopic: processes vs. queuing

Quote: Acme uses a new process for each I/O request; its state encodes the state of the I/O request and removes the need for queues [pikeR1_1994]

Subtopic: resource acquisition

Quote: resource acquisition is initialization. Resources, like objects, are released in the reverse order of their acquisition; works well with exception handling [stroB_1991x]

Subtopic: GPU

Quote: multisplit is a parallel primitive for GPUs that permutes its input data into contiguous buckets or bins; 3x-7x speedup over radix sort; also warp-synchronous programming and hierarchical reordering [ashkS3_2016]

Subtopic: restricted parallelism

Quote: each context level acts as a single mutex lock and thus prevents cycles in the invocation structure [maysD5_2002]

Subtopic: dynamic parallelism

Quote: dynamic parallelization by annotating code blocks for separate execution; heap examiners use pointer analysis to determine safe execution [eomY2_2012]

Subtopic: parallel libraries

Quote: Lithe provides a unvirtualized hardware thread primitives (hart) to each parallel library; allows composability without overhead [panH6_2010]
Quote: parallel LINQ uses the task parallel library with work stealing and duplicating queues [leijD10_2009]

Subtopic: memory allocation for parallel processes

Quote: Hoard memory allocator for parallel C and C++ programs; one global heap and per-processor heaps with low synchronization costs; up to 18x better [bergED11_2000]

Subtopic: instruction-level parallelism

Quote: C programs expect the same abstract machine as in a PDP-11, but high-performance processors rely on instruction-level parallelism; modern cores have up to 180 instructions in flight at the same time [chisD7_2018]

Subtopic: state machine parallelism

Quote: data-parallel finite-state machines for multi-core processors; efficiently enumerates transitions from all possible states on each input symbol [mytkT3_2014]
Quote: an ESP context switch only saves the program counter; no stack needed for a state machine [kumaS6_2001]
Quote: the ESP compiler extracts Spin models for checking; can often fully debug with the model checker [kumaS6_2002]
Quote: reimplemented VMMC using 500 lines of ESP and 3000 lines of C; replaced 15,600 lines; exhaustively checked the ESP code [kumaS6_2002]

Subtopic: model checking

Quote: logic model checking is effective for concurrency-related errors; based on temporal logic as proposed by Pnueli [holzGJ11_2002]
Quote: the TimeLine graphical editor simplifies the specification of basic temporal properties for SPIN model checkers; linear temporal logic is difficult to use [holzGJ11_2002]

Subtopic: race detection

Quote: synchronization patterns and type annotation for static race detection; found errors in Java libraries; 20 annotations per 1000 lines [flanC6_2000]

Subtopic: power management

Quote: hardware design and power management need to consider parallel workloads; sequential workloads do not approximate managed workloads or native parallel workloads [esmaH7_2012]
Quote: energy savings through simultaneous multithreading (SMT); provides fine-grain parallelism to distinct threads while sharing processor hardware [esmaH7_2012]

Subtopic: efficiency

Quote: efficient heap management via a separate process; bulk deallocate, pre-allocation of multiple objects [kharM10_2006]

Subtopic: design for parallel

Quote: parallelizing requires reorganization; e.g., producer-consumer or master-worker [pankV11_2009]
Quote: design for parallel -- modular tasks, no side effects, parameterize functions for division of work, thread-safe, parallel design patterns [pankV11_2009]
Quote: parallel design patterns include master-worker, pipeline, producer-consumer [pankV11_2009]

Subtopic: history of parallel processing

Quote: the Apollo Guidance Computer was programmable with a parallel processor, 16-bit words, and real-time digital control; Block II multiply in 47.8 microseconds, 70 lb., and 55W power [hallEC_1996]
Quote: multi-programming needs meta-instructions for parallel processing, naming, and protection [dennJB3_1966]
Quote: a process is a locus of control, an abstract entity that moves through the instructions [dennJB3_1966]
Quote: can interpose additional computation for a quarter of the cycles; reduces time to 9.9 seconds [hoppG_1946]

Subtopic: serial vs. parallel

Quote: processing of sensory input is serial or nearly so; this limits our capacity to process multiple inputs arriving in quick succession [chriMH1_2016]

Subtopic: problems with parallel processing

Quote: most performance bugs caused by inefficient function-call combinations, skippable functions, and unnecessary synchronization [jinG6_2012]
Quote: the problems of multithreaded programming are due to its preemptive semantics; one thread may interfer with another thread, creating a non-atomic read-modify-write [yiJ2_2011]
Quote: parallelizing sequential code requires significant modification; e.g., bzip has side effects, data dependencies, and optimizations [pankV11_2009]
Quote: the most common synchronization error was ignoring read synchronization; data may be out-of-date or uninitialized [hoveD7_2004]
Quote: programming with threads is difficult; misuse of concurrency is widespread; use a bug checker to catch concurrency problems [hoveD12_2004]
Quote: simple analysis find obvious synchronization errors; programmers do not understand synchronization and they avoid using synchronization [hoveD7_2004]
Quote: Smartcom for Windows abandoned concurrent programming due to difficulties with Smartcom III; problems remained with efficiency, non-reentrant message handlers, and unbounded message recursion [maysD5_2002]
Quote: threaded programs are extremely difficult to prove equivalent; due to the enormous number of possible interleavings [leeEA5_2006]
Quote: threads are wildly nondeterministic; seriously flawed as a computation model [leeEA5_2006]
Quote: threaded programs rely on programming style to constrain their nondeterminism [leeEA5_2006]
Quote: keep number of threads near the number of CPUs [smaaB2_2006]

Group: coordination system
Group: digital communication
Group: distributed systems
Group: machine model
Group: operating system
Group: process control
Group: relationship between brain and behavior
Group: robots
-
Topic: a single system image
Topic: artificial neuron nets
Topic: broadcasting information
Topic: co-routines
Topic: computer architecture
Topic: coordinated processes
Topic: interrupt handler
Topic: memory model
Topic: mobile code
Topic: monitored variable
Topic: multi-processing for a user interface
Topic: multi-user systems
Topic: multiple activities in a user interface
Topic: multiple processors
Topic: parallel programming languages
Topic: proving concurrent programs
Topic: shared objects
Topic: specification and design of distributed systems
Topic: testing multithreaded code
Topic: transactional memory

Subtopics

concurrent algorithms
design for parallel
deterministic parallel processing
dynamic parallelism
efficiency
GPU
history of parallel processing
idealized parallelism
instruction-level parallelism
inter-program communication
many-core processors
map reduce
memory allocation for parallel processes
message passing vs. shared data
model checking
parallel libraries
power management
problems with parallel processing
process pair
process state
processes vs. queuing
race detection
resource acquisition
restricted parallelism
scaling
serial vs. parallel
state machine parallelism
yield point

Group: parallel processing

Summary

Group members

Related

Subtopics