kpitsimpl: May 2018

The GOF Software Design Patterns at a Glance

If you are serious about developing software, read this book. It contains timeless concepts you should know well.

I. Creational Patterns
1. Abstract Factory - facilitates the creation of similar objects that share interface
2. Builder - dynamic SQL statements and other dynamically constructed code
3. Factory Method - custom way to construct an object (vs. normal/default class constructor)
4. Prototype - create new instances by copying existing instances
5. Singleton - assurance of 1 and only 1 instance of an entity in an application

II. Structural Patterns
1. Adapter - a means of making an object compatible with another object via a common interface
2. Bridge - the decoupling of the adaptor/interface and the incompatible items (light switch w/fans)
3. Composite - a group of objects treated the same way as a single instance of the same type of object
4. Decorator - adding new functionality to a class via subclass with new methods, props, etc.
5. Façade - simplified interface to a complicated backend
6. Flyweight - reuse of an object instance to create more instances of the same type
7. Proxy - entity serving as a stand-in for the real thing. Often a wrapper for a more complex class.

III. Behavioral Patterns
1. Chain of Responsibility - source command is served by multiple processes with distinct linear roles.
2. Command - encapsulate a request as an object, works great with HttpRequest and user actions
3. Interpreter - dynamic culture and localization settings to display appropriate UI text
4. Iterator - centralize communication among objects that have a common interface (ie. TV remote)
5. Mediator - centralize complex communication and control among related objects
6. Memento - persist and recall object state
7. Observer - a central watcher picks up picks up broadcast messages from everywhere it observes (Twitter)
8. State - persists object state and can react in different ways according to attributes of its current state
9. Strategy - enabling an object to change/choose algorithm(s) at runtime
10. Template Method - great for interfaces; design skeleton w/base needs; allow concretes to implement details
11. Visitor - code execution varies according to visiting object (ie. keyboard keys, mice clicks, WiFi Routers, etc.)

References:

https://softwareengineering.stackexchange.com/questions/234942/differentiating-between-factory-method-and-abstract-factory

https://www.cs.cmu.edu/~charlie/courses/15-214/2016-spring/slides/24%20-%20All%20the%20GoF%20Patterns.pdf

https://quizlet.com/129434321/design-patterns-gof-in-1-sentence-flash-cards/

Encryption in Transit, Encryption at Rest

For example, the typical route of encryption is data being encrypted when a value is stored to disk or sent over a network, then that data is decrypted by an authorized shared key holder application or service, and once the decrypted data has been used, it is invalidated or (if modified) encrypted and stored back on disk or over and across networks to eventually be stored on disk, in user privacy-respecting, encrypted format.

Encryption in Transit
You want to encrypt input, let's say a credit card number, from someone's cell phone. You need the credit card number to be encrypted while en route but ultimately decrypted after the transit is complete and the encrypted card data arrives at the merchant server. Here is what encryption in transit looks like:

Data securely encrypted on the wire going to and fro...

Encryption at Rest

Take for example the case of storing that same credit card data on the merchant's server so that the credit card information can be reused for future purchase payments. In this case, you will want to keep the data encrypted when you write it to disk in order to preserve user privacy.

The data is only decrypted when it is necessary (ie. when a new payment is processed, the encrypted data will be briefly decrypted so that it can be sent to the payment processor service).

Data safely encrypted on disk storage

Encrypted data is only ever decrypted on demand- when something requests it. Encrypted data is secure so long as only intended parties have the shared secret(s) key(s) to decrypt the messages.

Reference: http://blog.fourthbit.com/2014/12/23/traffic-analysis-of-an-ssl-slash-tls-session

OLAP: Facts and Dimensions

OLAP can adequately be described as the storage of redundant copies of transactional records from OLTP and other database sources. This redundancy facilitates quick lookups for complex data analysis because data can be found via more and quicker (SQL execution) paths than normalized OLTP. And OLTP after all- should stick to its namesake and what it does best: processing, auditing, and backing up online transactions, no? Leave data analysis to separate OLAP Warehouses.

OLAP data is typically stored in large blocks of redundant data (the data is organized in ways to optimize and accelerate your data mining computations). Measures are derived from the records in the fact table and dimensions are derived from the dimension tables.

Facts are "measurements, metrics or facts of a particular process" -Wikipedia. Facts are the measurement of the record value: "$34.42 for today's closing stock price", "5,976 oranges sold at the market", "package delivered at 4.57pm", etc.

Dimensions are "lists of related names–known as Members–all of which belong to a similar category in the user’s perception of a data". For example, months and quarters; cities, regions, countries, product line, domain business process metrics, etc.

Dimensions give you a set of things that you can measure and visualize in order to get a better pulse and better overall understanding of the current, past and even potential future (via regression models) shape of your data- which can often alert you to things you might not be able to see (certainly not as quickly or easily) in an OLTP-based Data Analysis model which is often tied to production tables out of necessity or "we don't have time or money to implement a Data Warehouse".

Yes, you definitely do need OLAP/DW capabilities if you think long-term about it. Having intimate knowledge of your operations and how you can change variables to better run your business? Why would any business person (excepting those committing fraud) not want that?

I'd say that implementing a true and effective OLAP environment is worth any project investment and would pay itself over and again in the way of better and more specific/actionable metrics that help administrators of operations make the best, data-backed decisions- some very critical decisions involving millions of dollars and sometimes lives. I'd like a better look at the data before making a multi-million dollar or life/death decision.

SAS, Hadoop, SSAS, with healthy doses of R and/or Python customization?- whatever data solution you choose, my advice is to go with something that has a great track record of providing the kind of solutions that your business and/or your industry require. Use the tool that your ETL developers can utilize effectively and that helps you to best meet the needs of your constituents with whom you will exchange data with (the company, the customer, industry and/or regional regulation compliance).

Open Systems Interconnection (OSI) Model

"The purpose of the OSI reference model is to guide vendors and developers so the digital communication products and software programs they create will interoperate, and to facilitate clear comparisons among communications tools." -Vikram Kumar

Abstract OSI Model

OSI Model cross-references with the protocols that facilitate network communication

The OSI model is a visualized and defined "meaning" of abstract data communication across networks. Some people have argued that the OSI model doesn't meet the needs to explain the abstract ideas in modern TCP:

"To make matters worse, the Internet's evolution, based on TCP/IP, never strictly followed the old OSI model at all. A reasonable person might ask whether people who talk about 'Layer 1' or 'Layer 3' aren’t blowing kisses at an old friend instead of recognizing the relevance of the original OSI model."

Ouch. Well, like it or not the OSI model (10 years on after this 2008 article was published) is still one of the best ways we as IT people have to describe network packet travel from wire to UI screen and back.

Any analogy or model works so long as we understand what happens when we send web requests out to networks (by clicking, by pressing enter, by having certain background services sending random telemetry, polling and other non-user issued web requests).

But standardization is important for globally agreed-upon understandings that serve as a common template and enable "apples to apples" communication on ideas being researched and the building of solutions worldwide, which, at this juncture in the 21st Century, is increasingly common.

It truly is a small world after all and I'd argue we really do neeeed to have standards like:

ISO 35.100 - Open Systems Interconnection
ISO 9001 Quality Management
ISO/IEC 27001 Information Security Management,
ISO 14001 Environmental Management
ISO 31000 - Risk management

...in order to communicate in a common language across the earth and prevent a sprawl of "20 ways to describe the same idea". Let's try to keep these things simple, standardized, and open to easy integrative communication.

References:

https://www.iso.org/standards.html

https://www.lifewire.com/layers-of-the-osi-model-illustrated-818017

https://www.networkworld.com/article/3239677/lan-wan/the-osi-model-explained-how-to-understand-and-remember-the-7-layer-network-model.html

Fire Drills to Ensure Successful Disaster Recovery

TEST. YOUR. DISASTER. RECOVERY. PLAN.

This cannot be said enough. I have heard countless horror stories (and lived 1) of instances where backups were completing "successfully" but no one ever tested the restoration of those backups (which were, in fact, corrupt and not restore-able)- until it was too late. :/

Put in the time up front to make sure your data recovery strategy and high-availability guarantee actually "recover" and make "available" your data- the lifeblood of your organization.

It is important to remember that you can have a Secondary database server that is 100% in sync with the Primary database server, but if your application is not configured to make the switch (or your service host does not make the switch for you)- you will only have 1/2 of things recovered: the data, but not the data availability through the application.

Good References on Disaster Recovery Strategies and what to consider ("sensitivity of data, data loss tolerance, required availability, etc."):

https://www.sqlshack.com/sql-server-disaster-recovery/

https://support.microsoft.com/en-us/help/822400/description-of-disaster-recovery-options-for-microsoft-sql-server

ETL and the Case Against a Single System for All Business Processes

Nearly every business wants to centralize their business data so that they can get an at-a-glance overview of what is going on at any given point in the day. Suppose, for instance, a company has its business data on 5 separate database servers, in various databases therein, across several different tables. And much of the data will need to be slightly modified (transformed) in order to make data types consistent and optimize it for quick and meaningful data analysis.

Assume the database servers are:

CMS, Orders (Dynamics 365 data)
Accounting (Sage 50 data)
Warehouse, Logistics (SAP data)
Field Service, Product Support (SalesForce data)
Call Center (TantaComm data)

Furthermore, the business wants to apply some custom logic to various fields so that certain calculations don't have to be made by the OLAP end users and applications.

ETL is just moving data from a Source to a Destination, often making some changes (transformations) along the way

Well, it all seems so complicated, right? How can all that disparate data be combined and standardized? It's not as complicated as allowing an ERP solution define how your various departments do their job. Some ERP products truly do have the functionality to satisfy 80-85% of the company business requirements. And when that is the case, by all means, companies should go that route.

But for instances where ERP of every business function is not working, you can use products like SSIS, Informatica, and Oracle Data Integrator to easily map ETL data visually and apply code operations to the data using a variety of programming languages.

But of course, there is no "one-size-fits-all 100% of our business needs" ERP solution; different business productivity software work better or worse for different industries, different departmental needs, for all manner of different reasons. Why care about the choice of business tool that your business' management and best minds know and have proven is most effective for their- often very specific- job?

Allowing a business to integrate the "best apps for each particular type of job" is nothing more than a sort of microservices architecture. But if we are working with custom code in these various LOB applications (at least some of the LOB apps will need custom API integrations so that they can talk to each other), with each new block of custom code, comes more decentralized code bases and decentralized test scripts...

All considered, I would argue that trying to put square pegs into round holes a la "ERP-everything" is a much, much, MUCH bigger effort than a well-developed and tested scheduled ETL Data Warehouse refresh process to centralize your company's data.

It's always a balance with software. Use the most effective apps for your organization and departmental requirements, but try to avoid the dependency hell that can come along with a poorly managed distributed application environment.

Here are the ETL steps in order, from the resource link below:

Extract the data from the external data sources such as line-of-business systems, CRM systems, relational databases, web services, and anywhere you have business data you want to summarize.

Transform the data. This includes cleansing the data and converting it to an OLAP-friendly data model. The OLAP-friendly data model traditionally consists of dimension and fact tables in a star or snowflake schema.

Load the data so that it can be quickly accessed by querying tools such as reports, dashboards, etc.

"90% of the requirements now is better than 100% of the requirements never" -Unknown

Reference: https://blogs.msdn.microsoft.com/andreasderuiter/2012/12/05/designing-an-etl-process-with-ssis-two-approaches-to-extracting-and-transforming-data/

SOLID

"These principles, when combined together, make it easy for a programmer to develop software that are easy to maintain and extend." -Robert 'Uncle Bob' Martin

S - Single-Responsibility - objects should only serve one purpose

O - Open-Closed - types should be open to extending capabilities, closed to base changes

L - Liskov Substitution - subtypes of base types should produce objects compatible with base

I - Interface Segregation - client types do not have to depend on unused interface methods

D - Dependency Inversion - rely on abstract templates, not prescriptive/restrictive concretes

Reference: https://scotch.io/bar-talk/s-o-l-i-d-the-first-five-principles-of-object-oriented-design

Levity on the Never-ending Race Towards the Next Big IT Trend

Because, c'mon now- at some level, every one of us can relate to 1 or both of these guys 😆

Reference: https://www.explainxkcd.com/wiki/index.php/198

Log Shipping and Log (and/or Differential Backup) Restoration

Most developers know what a full database backup is. Did you know a differential backup is just a subset of the full backup? But what is the point of transaction log shipping in SQL Server?

A "transaction log" is just a subset of a subset of a full backup containing a set of all T-SQL transactions for a given (usually short span) amount of time; they are typically set at intervals of every few minutes but this varies according to application and data compliance needs.

A commenter in the referenced web link below aptly notes:

"the point of log shipping is to keep the secondary server closer to 100% synced than the full backups (which are much larger and produced much less frequently than log backups)."

Log shipping can be used to serve as a constant "warm backup" (similar to SQL Server database mirroring) of all production transactions for a standby/failover SQL Server* as well as a dedicated Backup SQL Server as depicted here:

Setup if you have a fully-functioning Secondary Production SQL Server

Or, more commonly, the shipped backup logs are sent to a single Backup Server:

Typical production database server scenario, though the steps are a bit out of sequential order, if not (or due to) visual order

In short, a full backup contains the entire database you to recover- up to a certain point in time.

A differential backup is a subset of a backup (generally "full" backups contain days worth of data for bigger organizations)- it would only be prudent to include one or even four differential backups to ensure quick disaster recovery (and full recovery of vital customer data).

Backup logs are chunks of SQL transactions that eventually compose the differential backup(s) and finally the entire full .bak file.

These 3 disaster recovery tools that SQL Server affords DBAs and developers allow you to restore your production data back to the point in time when your server failed:

Apply the most recent full backup of the prod server that went down*
Apply any differential backups since the most recent full backup
Apply all transaction logs up to the point in time of the failure**
Apply all StandBy server transactions logs that may have occurred (if not ReadIOnly)

With all of this said, do be aware that different scenarios and requirements call for different kind of disaster recovery solutions. Consult with the experts and never neglect to TEST YOUR RESTORE PROCEDURE WITH ACTUAL .bak and .trn files- and then test the applications and services connected to those database servers against their restored/Secondary version to ensure you suffer as little downtime as possible in the event of a database emergency.

What a restoration timeline might look like

*Note(!) to the questions on Microsoft Licensing and your StandBy server, should it need to be used for any production activity during your disaster recovery:

"I am not the licensing police, and I am not Microsoft – check with your licensing representative to clarify your exact situation. Generally, you can have one warm standby server. However, the second someone starts using it for reporting, testing, or anything else, you need to license it like any other server."-Brent Ozar

**"If the database is in Standby mode, users can query it – except when a log backup is being restored. You need to decide if a restore job will disconnect users, or if the restore is delayed until after the users are disconnected." -Brent Ozar

References:

https://www.sqlshack.com/sql-server-disaster-recovery/ (SQL Server Replication is another compliment or alternative to the log shipping and backup/restore)

https://www.brentozar.com/archive/2013/03/log-shipping-faq/

Blockchain

Blockchain, apart from all the hoopla heralding it as a "cryptic" means of facilitating monetary transactions, is essentially just a distributed digital ledger (multiple sources of the exact same truth) and distributed digital notaries (transaction verifiers) who, unlike human money handlers and human notaries- do not lie.

Transactions can be scheduled via any desired logic (ie. loan repayment terms, payroll distribution, annuity payouts, securities purchase/sale/options, etc.) and encryption lies at the foundation of the technology- ensuring data integrity and information security.

Blockchain is a type of DLT design. DLT is well-illustrated below:

Much like Git commits, each blockchain transaction is forever tied to an unmodifiable unique SHA-256 hash sum value that ensures the immutability (un-changeability) of each transaction- in brief, if the transaction changes, the original SHA-256 hash sum (unique file signature) that has already been distributed to all nodes of the blockchain- will not match the modified transaction's SHA-256 hash sum, and everyone connected to the blockchain will be able to see that someone is trying to commit a fraudulent transaction modification. So changes to already-posted transactions within a blockchain digital ledger (in theory at least) are virtually impossible.

Reference Video: https://anders.com/blockchain/