
Software Architecture General Reference

These notes cover the high-level structures of software systems. They are mainly a list of key terms, kept as a reference for other notes.

Terms

Eventual consistency - A consistency model that guarantees that if no new updates are made to a given data item, eventually all accesses to that item will return the last updated value.

Design Stamina Hypothesis - As time goes on, adding new features to a codebase becomes more painful. With good architecture (refactoring regularly, keeping the codebase healthy) you can attenuate or even reverse this effect.

Software Architecture - More like city planning than building architecture (UX designers are closer to traditional architects).

Atomic Variables - Consider a program with two threads. One thread processes some list of files and increments a counter each time it finishes working on one of them. The other thread handles the user interface, and will periodically read the counter to update a progress bar. If that counter is a 64-bit integer, we have a problem on 32-bit machines, since two loads or stores are needed to read or write the entire value. If we’re having a particularly unlucky time, the first thread could be halfway through writing the counter when the second thread reads it, receiving an incorrect value. These unfortunate occasions are called torn reads and writes.
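
A minimal sketch in Go of the scenario above: sync/atomic guarantees whole-value loads and stores even on 32-bit platforms, so the progress reader can never observe a torn write. The file count and sleep intervals are made up for illustration.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

func main() {
	// atomic.Int64 guarantees whole-value loads and stores even on
	// 32-bit platforms, so the reader can never see a torn write.
	var processed atomic.Int64

	// Worker: processes 100 (hypothetical) files, bumping the counter.
	go func() {
		for i := 0; i < 100; i++ {
			time.Sleep(10 * time.Millisecond) // stand-in for real work
			processed.Add(1)
		}
	}()

	// UI: periodically reads the counter to update a progress bar.
	for {
		n := processed.Load()
		fmt.Printf("\rprogress: %d/100", n)
		if n == 100 {
			fmt.Println()
			return
		}
		time.Sleep(50 * time.Millisecond)
	}
}
```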

Synchronization

  • Blocking synchronization - threads can be made to wait an arbitrary amount of time, typically via a mutex. If one thread locks the mutex and another attempts to do the same, the second thread is made to wait (blocks) until the first thread releases the lock, however long that may be.

  • Lockless synchronization - ensures that at least one thread is always making progress. These designs are non-blocking since no thread can cause another to wait indefinitely, e.g. an embedded system where a sensor invokes an interrupt service routine (ISR) when new data arrives.

    • Lockless designs need solutions for atomic operations, memory barriers, and the ABA problem

      • Atomic operations guarantee that no half-writes or half-reads can be observed (e.g. a 32-bit machine that needs two steps to write or read a 64-bit value could otherwise expose a torn value mid-operation)
      • Declaring variables atomic guarantees sequential consistency: all threads agree on the order in which memory operations occurred, and that order is consistent with the order of operations in the program source code (see the sketch after this list)
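
A minimal sketch in Go contrasting the two approaches: a mutex-protected counter (blocking) and a compare-and-swap loop (lock-free, since a failed CAS means some other thread succeeded and made progress).

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Blocking: callers contend on a mutex; a thread holding the lock can
// make every other thread wait for an arbitrary amount of time.
type lockedCounter struct {
	mu sync.Mutex
	n  int64
}

func (c *lockedCounter) Inc() {
	c.mu.Lock()
	c.n++
	c.mu.Unlock()
}

// Lock-free: a compare-and-swap loop. A failed CAS means another
// thread succeeded, so at least one thread is always making progress.
type casCounter struct{ n atomic.Int64 }

func (c *casCounter) Inc() {
	for {
		old := c.n.Load()
		if c.n.CompareAndSwap(old, old+1) {
			return
		}
	}
}

func main() {
	var lc lockedCounter
	var cc casCounter
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				lc.Inc()
				cc.Inc()
			}
		}()
	}
	wg.Wait()
	fmt.Println(lc.n, cc.n.Load()) // both print 8000
}
```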

Event Based Systems

Event Collaboration - Instead of components making requests when they need something, components raise events when things change. Other components then listen to events and react appropriately.

  • One difference is the way the event is named. Rather than phrasing it as an instruction telling the recipient to do something, it takes the form of announcing that an event has happened, inviting those interested to respond (see the sketch below).
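
A minimal in-process sketch in Go of Event Collaboration (the Bus type and event names are made up for illustration): components publish events named for what happened, and any number of interested components subscribe and react.

```go
package main

import "fmt"

// A minimal in-process event bus: components raise events when state
// changes; interested components subscribe and react.
type Bus struct {
	handlers map[string][]func(payload any)
}

func NewBus() *Bus { return &Bus{handlers: map[string][]func(any){}} }

func (b *Bus) Subscribe(event string, h func(payload any)) {
	b.handlers[event] = append(b.handlers[event], h)
}

func (b *Bus) Publish(event string, payload any) {
	for _, h := range b.handlers[event] {
		h(payload)
	}
}

func main() {
	bus := NewBus()

	// The event is named for what happened ("OrderPlaced"), not for
	// what the recipient should do ("SendInvoice").
	bus.Subscribe("OrderPlaced", func(p any) { fmt.Println("billing: invoice for", p) })
	bus.Subscribe("OrderPlaced", func(p any) { fmt.Println("shipping: pick list for", p) })

	bus.Publish("OrderPlaced", "order-42")
}
```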

Useful Patterns

Command & Query Responsibility Segregation (CQRS)

  • Segregate operations that read data from operations that update data by using separate interfaces.

  • In traditional designs, all create, read, update, and delete (CRUD) operations are applied to the same representation of the entity. This works well when only limited business logic is applied to the data operations.

  • The CQRS pattern segregates the operations that read data (queries) from the operations that update data (commands) behind separate interfaces. This means that the data models used for querying and for updates can differ.

    • Can easily move to a task-based UI.

  • Fits well with event-based programming models. It is common to see a CQRS system split into separate services communicating via Event Collaboration (see the sketch after this list).
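
A minimal sketch in Go of the interface split (the order domain and all names are hypothetical): commands mutate state and return no data beyond success or failure, queries return data and never mutate.

```go
package main

import "fmt"

// Commands mutate state and return no data beyond success or failure.
type OrderCommands interface {
	PlaceOrder(customerID, item string) error
	CancelOrder(orderID string) error
}

// Queries return data and never mutate state. The read model can be a
// denormalised view shaped for the UI rather than the write schema.
type OrderQueries interface {
	OpenOrdersFor(customerID string) []OrderSummary
}

type OrderSummary struct {
	OrderID, CustomerID, Item, Status string
}

// A toy implementation backing both interfaces with one in-memory
// slice; in a real system the two sides often have separate stores
// kept in sync via events.
type orderService struct{ orders []OrderSummary }

func (s *orderService) PlaceOrder(customerID, item string) error {
	id := fmt.Sprintf("o-%d", len(s.orders)+1)
	s.orders = append(s.orders, OrderSummary{id, customerID, item, "open"})
	return nil
}

func (s *orderService) CancelOrder(orderID string) error {
	for i := range s.orders {
		if s.orders[i].OrderID == orderID {
			s.orders[i].Status = "cancelled"
			return nil
		}
	}
	return fmt.Errorf("no such order %q", orderID)
}

func (s *orderService) OpenOrdersFor(customerID string) []OrderSummary {
	var open []OrderSummary
	for _, o := range s.orders {
		if o.CustomerID == customerID && o.Status == "open" {
			open = append(open, o)
		}
	}
	return open
}

func main() {
	svc := &orderService{}
	var commands OrderCommands = svc // write side
	var queries OrderQueries = svc   // read side

	commands.PlaceOrder("c-1", "book")
	fmt.Println(queries.OpenOrdersFor("c-1"))
}
```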

Database Design

ACID (Atomicity, Consistency, Isolation and Durability) properties are the key invariants that have to be enforced for a transaction to be reliably implemented without causing undesirable side effects.

  • Atomicity requires that the transaction completes fully or rolls back completely. Partially finished transactions should never be visible, and the system has to be built in a way that prevents this from happening (see the transaction sketch after this list).

  • Consistency requires that a transaction should never violate any invariants (such as declarative referential integrity) that are guaranteed by the database schema. For example, if a foreign key exists it should be impossible to insert a child record with a reference to a non-existent parent.

  • Isolation requires that transactions should not interfere with each other. The system should guarantee the same results if the transactions are executed in parallel or sequentially. In practice most RDBMS products allow modes that trade off isolation against performance.

  • Durability requires that once committed, the transaction remains in persistent storage in a way that is robust to hardware or software failure.
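
A sketch of atomicity in practice using Go's database/sql, assuming an already-opened *sql.DB, a driver that uses ? placeholders, and a hypothetical accounts table: either both updates commit together or the deferred rollback leaves the database untouched.

```go
package ledger

import (
	"database/sql"
	"fmt"
)

// transfer moves an amount between two (hypothetical) accounts.
// Atomicity: either both UPDATEs commit or, on any error, the
// rollback leaves the database untouched. Consistency, isolation and
// durability are enforced by the database engine on Commit.
func transfer(db *sql.DB, from, to string, amount int) error {
	tx, err := db.Begin()
	if err != nil {
		return err
	}
	defer tx.Rollback() // no-op once Commit has succeeded

	if _, err := tx.Exec(
		"UPDATE accounts SET balance = balance - ? WHERE id = ?", amount, from); err != nil {
		return fmt.Errorf("debit: %w", err)
	}
	if _, err := tx.Exec(
		"UPDATE accounts SET balance = balance + ? WHERE id = ?", amount, to); err != nil {
		return fmt.Errorf("credit: %w", err)
	}
	return tx.Commit()
}
```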

Shared Disk Architecture: An architecture in which all processing nodes in a cluster have access to all of the storage. This can present a central bottleneck for data access. Examples of shared-disk systems are Oracle RAC and Exadata.

Shared Nothing Architecture: An architecture in which processing nodes in a cluster have local storage that is not visible to other cluster nodes. Examples of shared-nothing systems are Teradata and Netezza.

Shared Memory Architecture: An architecture in which multiple CPUs (or nodes) can access a shared pool of memory. Most modern servers are of a shared memory type. Shared memory facilitates features such as shared caches and atomic synchronisation primitives that are much harder to implement on distributed systems.

Synchronisation: A generic term describing various methods for ensuring consistent access to a shared resource by multiple processes or threads. This is much harder to do on distributed systems than on shared memory systems, although some network architectures (e.g. Teradata's BYNET) had synchronisation primitives in the network protocol. Synchronisation can also come with a significant amount of overhead.

Semi-Join: A primitive used in joining data held on two different nodes of a distributed system. Essentially, one node bundles up just enough information about the rows to be joined (typically the join keys) and passes it to the other node to resolve the join. On a large query this can involve significant network traffic.
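
A minimal sketch in Go of the idea (the orders/customers split is made up): node A ships only the join keys to node B, and B returns just the matching rows rather than its whole table.

```go
package main

import "fmt"

type Row struct {
	Key   int
	Value string
}

// Node A holds orders; node B holds customers. Instead of shipping the
// whole customers table, A sends only the set of keys it needs (the
// semi-join), and B replies with just the matching rows.
func semiJoinKeys(local []Row) map[int]bool {
	keys := map[int]bool{}
	for _, r := range local {
		keys[r.Key] = true
	}
	return keys
}

func matchingRows(remote []Row, keys map[int]bool) []Row {
	var out []Row
	for _, r := range remote {
		if keys[r.Key] {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	orders := []Row{{1, "order-a"}, {3, "order-b"}}            // node A
	customers := []Row{{1, "alice"}, {2, "bob"}, {3, "carol"}} // node B

	keys := semiJoinKeys(orders)               // A -> B: just the keys {1, 3}
	fmt.Println(matchingRows(customers, keys)) // B -> A: [{1 alice} {3 carol}]
}
```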

Eventual Consistency: A term used to describe transaction semantics that trade off immediate update (consistency on reads) on all nodes of a distributed system for performance (and therefore higher transaction throughput) on writes. Eventual consistency is a side effect of using Quorum Replication as a performance optimisation to speed up transaction commits in distributed databases where multiple copies of data are held on separate nodes.
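
The quorum arithmetic behind this trade-off, as a small Go sketch: with N replicas, W write acknowledgements and R read consultations, reads are guaranteed to see the latest write only when R + W > N.

```go
package main

import "fmt"

// With N replicas, a write is acknowledged after W replicas confirm
// and a read consults R replicas. If R+W > N, every read quorum
// overlaps every write quorum, so a read always sees at least one
// up-to-date copy; if R+W <= N, reads may return stale data until
// replication catches up (eventual consistency).
func quorumOverlaps(n, w, r int) bool { return r+w > n }

func main() {
	n := 3
	fmt.Println(quorumOverlaps(n, 2, 2)) // true: reads see the latest write
	fmt.Println(quorumOverlaps(n, 1, 1)) // false: reads may be stale
}
```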

Lamport's Algorithm: An algorithm for implementing mutual exclusion (synchronisation) across systems with no shared memory. Normally mutual exclusion within a system requires an atomic read-compare-write or similar instruction of a type normally only practical on a shared memory system. Other distributed synchronisation algorithms exist, but Lamport's was one of the first and is the best known. Like most distributed synchronisation mechanisms, Lamport's algorithm depends on establishing a consistent ordering of events between cluster nodes, which it does with logical timestamps (Lamport clocks) rather than physically synchronised clocks.
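
A minimal Lamport clock sketch in Go, the ordering primitive the algorithm is built on: tick before local events and sends, and jump past the sender's timestamp on receive.

```go
package main

import "fmt"

// A Lamport logical clock: orders events across nodes without any
// physical clock synchronisation. Tick before each local event or
// send; on receive, jump past the sender's timestamp.
type LamportClock struct{ t uint64 }

func (c *LamportClock) Tick() uint64 {
	c.t++
	return c.t
}

func (c *LamportClock) Receive(remote uint64) uint64 {
	if remote > c.t {
		c.t = remote
	}
	c.t++
	return c.t
}

func main() {
	var a, b LamportClock
	sent := a.Tick()                // node A sends a request at t=1
	fmt.Println(b.Receive(sent))    // node B receives: its clock jumps to 2
	fmt.Println(a.Tick(), b.Tick()) // later local events: 2 and 3
}
```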

Two Phase Commit (2PC): A family of protocols that ensure that database updates involving multiple physical systems commit or roll back consistently. Whether 2PC is used within a system or across multiple systems via a transaction manager, it carries a significant overhead. In a two-phase commit protocol the transaction manager asks the participating nodes to persist the transaction in such a way that they can guarantee it will commit, then signal this status. When all nodes have returned a 'happy' status it then signals the nodes to commit. The transaction is still regarded as open until all of the nodes reply that the commit is complete. If a node goes down before signalling that its commit is complete, the transaction manager re-queries the node when it comes back up until it gets a positive reply indicating the transaction has committed.
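
A minimal sketch in Go of the two phases (the Participant interface and node type are made up for illustration; a real coordinator persists its own state and retries commits across restarts, as described above).

```go
package main

import "fmt"

// A participant must be able to promise, after Prepare, that Commit
// will succeed (e.g. by persisting the pending change to disk).
type Participant interface {
	Prepare(txID string) bool
	Commit(txID string)
	Rollback(txID string)
}

// The coordinator's two phases: collect votes, then commit or abort.
func twoPhaseCommit(txID string, nodes []Participant) bool {
	// Phase 1: ask every node to persist the transaction and vote.
	for _, n := range nodes {
		if !n.Prepare(txID) {
			for _, m := range nodes {
				m.Rollback(txID)
			}
			return false
		}
	}
	// Phase 2: all voted yes; tell every node to commit. A real
	// coordinator keeps the transaction open and retries Commit
	// until every node confirms, even across node restarts.
	for _, n := range nodes {
		n.Commit(txID)
	}
	return true
}

type node struct {
	name    string
	healthy bool
}

func (n node) Prepare(tx string) bool { fmt.Println(n.name, "prepare", tx); return n.healthy }
func (n node) Commit(tx string)       { fmt.Println(n.name, "commit", tx) }
func (n node) Rollback(tx string)     { fmt.Println(n.name, "rollback", tx) }

func main() {
	ok := twoPhaseCommit("tx-1", []Participant{node{"db1", true}, node{"db2", true}})
	fmt.Println("committed:", ok)
}
```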

Multi-Version Concurrency Control (MVCC): Managing contention by writing new versions of the data to a different location and allowing other transactions to see the old version of the data until the new version is committed. This reduces database contention at the expense of some additional write traffic to write the new version and then mark the old version as obsolete.
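
A toy MVCC store in Go, assuming integer commit and snapshot timestamps: writes append a new stamped version rather than overwriting in place, and a reader sees the newest version committed at or before its snapshot.

```go
package main

import "fmt"

// A toy MVCC store: writes append a new version stamped with a commit
// timestamp instead of overwriting in place; a reader sees the newest
// version committed at or before its snapshot timestamp.
type version struct {
	commitTS int
	value    string
}

type mvccStore struct {
	rows map[string][]version // versions kept in commit order
}

func (s *mvccStore) write(key, value string, commitTS int) {
	s.rows[key] = append(s.rows[key], version{commitTS, value})
}

func (s *mvccStore) read(key string, snapshotTS int) (string, bool) {
	var out string
	found := false
	for _, v := range s.rows[key] {
		if v.commitTS <= snapshotTS {
			out, found = v.value, true
		}
	}
	return out, found
}

func main() {
	s := &mvccStore{rows: map[string][]version{}}
	s.write("x", "old", 10)
	s.write("x", "new", 20) // a concurrent transaction commits later

	// A transaction that took its snapshot at ts=15 still sees "old".
	fmt.Println(s.read("x", 15)) // old true
	fmt.Println(s.read("x", 25)) // new true
}
```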

Election Algorithm: Distributed systems involving multiple nodes are inherently less reliable than a single system because there are more failure modes. In many cases clustered systems need some mechanism for dealing with the failure of a node. Election algorithms are a class of algorithms used to select a leader to coordinate a distributed computation in situations where the leader node is not fixed in advance or cannot be relied upon.
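
A bully-style election reduced to its core rule, as a Go sketch (the node IDs and reachability map are made up): among the nodes that are still reachable, the one with the highest ID becomes leader and announces itself to the rest.

```go
package main

import "fmt"

// A bully-style election reduced to its core rule: among the nodes
// that are still reachable, the one with the highest ID wins.
func elect(reachable map[int]bool) (leader int, ok bool) {
	for id, up := range reachable {
		if up && (!ok || id > leader) {
			leader, ok = id, true
		}
	}
	return leader, ok
}

func main() {
	nodes := map[int]bool{1: true, 2: true, 3: false} // node 3 has failed
	if leader, ok := elect(nodes); ok {
		fmt.Println("new leader:", leader) // 2
	}
}
```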

Horizontal Partitioning: A table may be split across multiple nodes or storage volumes by its key. This allows a large data volume to be split into smaller chunks and distributed across storage nodes.

Sharding: A data set may be horizontally partitioned across multiple physical nodes in a shared-nothing architecture. Where this partitioning is not transparent (i.e. the client must be aware of the partition scheme and work out which node to query explicitly) this is known as sharding. Some systems (e.g. Teradata) do split data across nodes but the location is transparent to the client; the term is not normally used in conjunction with this type of system.
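
A minimal sketch in Go of client-side sharding with the naive hash-mod-N scheme (the shard names are made up). Its weakness is that changing the node count remaps almost every key, which is what consistent hashing, below, addresses.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// Client-side sharding: the client knows the partition scheme and
// picks the node itself. Hash-mod-N is simple, but adding or removing
// a node remaps almost every key.
func shardFor(key string, numShards int) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32() % uint32(numShards))
}

func main() {
	shards := []string{"db0", "db1", "db2"}
	for _, key := range []string{"user:42", "user:43", "user:44"} {
		fmt.Println(key, "->", shards[shardFor(key, len(shards))])
	}
}
```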

Consistent Hashing: An algorithm used to allocate data to partitions based on the key. It is characterised by even distribution of the hash keys and the ability to elastically expand or reduce the number of buckets efficiently. These attributes make it useful for partitioning data or load across a cluster of nodes where the size can change dynamically with nodes being added or dropping off the cluster (perhaps due to failure).
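
A minimal consistent hash ring in Go (real implementations add many virtual nodes per server to even out the distribution): nodes are hashed onto a ring, and each key belongs to the first node clockwise from its hash, so adding or removing a node only remaps the keys in its arc.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// A minimal consistent hash ring. Nodes are hashed onto a ring of
// points; a key is owned by the first node clockwise from its hash.
type Ring struct {
	points []uint32
	owner  map[uint32]string
}

func hashOf(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

func (r *Ring) Add(node string) {
	p := hashOf(node)
	r.points = append(r.points, p)
	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
	r.owner[p] = node
}

func (r *Ring) NodeFor(key string) string {
	h := hashOf(key)
	// First ring point at or after the key's hash, wrapping around.
	i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= h })
	if i == len(r.points) {
		i = 0
	}
	return r.owner[r.points[i]]
}

func main() {
	r := &Ring{owner: map[uint32]string{}}
	for _, n := range []string{"node-a", "node-b", "node-c"} {
		r.Add(n)
	}
	fmt.Println(r.NodeFor("user:42"))
	fmt.Println(r.NodeFor("user:43"))
}
```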

Multi-Master Replication: A technique that allows writes across multiple nodes in a cluster to be replicated to the other nodes. This technique facilitates scaling by allowing some tables to be partitioned or sharded across servers and others to be synchronised across the cluster. Writes must be replicated to all nodes as opposed to a quorum, so transaction commits are more expensive on a multi-master replicated architecture than on a quorum replicated system.

Non-Blocking Switch: A network switch that uses internal hardware parallelism to achieve throughput that is proportional to the number of ports with no internal bottlenecks. A naive implementation can use a crossbar mechanism, but this has O(N^2) complexity for N ports, limiting it to smaller switches. Larger switches can use a more complex internal topology called a non-blocking minimal spanning switch to achieve linear throughput scaling without needing O(N^2) hardware.

Scalability is not a binary yes-or-no property; it should always be expressed as a cost per unit of scale.