Generate Secure Machine Key Section for Web.config via PowerShell

In ASP.NET, machine keys are used to encrypt and validate data such as view state, forms authentication tickets, and session state. Servers that are part of a web farm must share the same machine key so that any server can read data protected by another.

Once this PowerShell function has been run and saved to your PS session, it can be called via:

PS C:\> Generate-MachineKey


Copy the output of this PS function and paste it into your web.config, e.g.:
 <configuration>  
  <system.web>  
   <machineKey ... />  
  </system.web>  
 </configuration>  

Generate-MachineKey function definition/PS script
 # Generates a <machineKey> element that can be copied + pasted into a Web.config file.  
 function Generate-MachineKey {  
  [CmdletBinding()]  
  param (  
   [ValidateSet("AES", "DES", "3DES")]  
   [string]$decryptionAlgorithm = 'AES',  
   [ValidateSet("MD5", "SHA1", "HMACSHA256", "HMACSHA384", "HMACSHA512")]  
   [string]$validationAlgorithm = 'HMACSHA256'  
  )  
  process {  
   function BinaryToHex {  
     [CmdLetBinding()]  
     param($bytes)  
     process {  
       $builder = new-object System.Text.StringBuilder  
       foreach ($b in $bytes) {  
        $builder = $builder.AppendFormat([System.Globalization.CultureInfo]::InvariantCulture, "{0:X2}", $b)  
       }  
       $builder  
     }  
   }  
   switch ($decryptionAlgorithm) {  
    "AES" { $decryptionObject = new-object System.Security.Cryptography.AesCryptoServiceProvider }  
    "DES" { $decryptionObject = new-object System.Security.Cryptography.DESCryptoServiceProvider }  
    "3DES" { $decryptionObject = new-object System.Security.Cryptography.TripleDESCryptoServiceProvider }  
   }  
   $decryptionObject.GenerateKey()  
   $decryptionKey = BinaryToHex($decryptionObject.Key)  
   $decryptionObject.Dispose()  
   switch ($validationAlgorithm) {  
    "MD5" { $validationObject = new-object System.Security.Cryptography.HMACMD5 }  
    "SHA1" { $validationObject = new-object System.Security.Cryptography.HMACSHA1 }  
    "HMACSHA256" { $validationObject = new-object System.Security.Cryptography.HMACSHA256 }  
    "HMACSHA384" { $validationObject = new-object System.Security.Cryptography.HMACSHA384 }  
    "HMACSHA512" { $validationObject = new-object System.Security.Cryptography.HMACSHA512 }  
   }  
   $validationKey = BinaryToHex($validationObject.Key)  
   $validationObject.Dispose()  
   [string]::Format([System.Globalization.CultureInfo]::InvariantCulture,  
    "<machineKey decryption=`"{0}`" decryptionKey=`"{1}`" validation=`"{2}`" validationKey=`"{3}`" />",  
    $decryptionAlgorithm.ToUpperInvariant(), $decryptionKey,  
    $validationAlgorithm.ToUpperInvariant(), $validationKey)  
  }  
 }  
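
For comparison, here is a rough Python sketch of the same idea: generate random keys and format them as a machineKey element. It is not a port of the script above. The function name and the key lengths are my assumptions (the lengths are meant to mirror the default key sizes of the .NET classes used in the PowerShell version), and only the AES decryption option is covered.

```python
import secrets

# Key lengths in bytes. These are assumptions intended to mirror the default
# key sizes of the .NET classes used in the PowerShell function.
DECRYPTION_KEY_BYTES = {"AES": 32}
VALIDATION_KEY_BYTES = {"HMACSHA256": 64, "HMACSHA384": 128, "HMACSHA512": 128}


def generate_machine_key(decryption="AES", validation="HMACSHA256"):
    """Return a <machineKey> element containing freshly generated random keys."""
    # secrets.token_hex(n) returns 2*n hex characters from a secure random source.
    decryption_key = secrets.token_hex(DECRYPTION_KEY_BYTES[decryption]).upper()
    validation_key = secrets.token_hex(VALIDATION_KEY_BYTES[validation]).upper()
    return (
        f'<machineKey decryption="{decryption}" decryptionKey="{decryption_key}" '
        f'validation="{validation}" validationKey="{validation_key}" />'
    )


print(generate_machine_key())
```

Every call produces a different element, so generate the key once and paste the same value into the web.config of every server in the farm.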

Accessing SQL Server Data in R

Importing SQL Server data into R for analysis is straightforward. You will need R and the RODBC package installed on your machine. The following R code (run here in R Studio) connects to your SQL Server database and runs a query:

 library(RODBC)  
 dbconnection <- odbcDriverConnect('driver={SQL Server};server=.;database=CLARO;trusted_connection=true')  
 initdata <- sqlQuery(dbconnection,paste('SELECT * FROM [CLARO].[dbo].[Fielding];')) 


SELECT data from a SQL Server database output in R Studio


Accessing SQL Server Data in Python

So you want to access Microsoft SQL Server from your Python script(s)?

After weeding out some long-abandoned and/or nonworking solutions, I discovered "pyodbc", a very simple Python ODBC module that works with virtually all SQL Server versions since MSSQL 2005.

First, you will need to install the Microsoft ODBC Driver for SQL Server (version 13.1 or 17 should work) on your machine, in addition to the pyodbc module.

Next, get the pyodbc module for Python by running this from Windows command prompt:

pip install pyodbc

Then open up a python shell using 'py' or 'python' and enter the following after editing configuration values to match your development environment:

 import pyodbc  
 cnxn = pyodbc.connect('DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;DATABASE=WideWorldImporters;UID=DemoUser;PWD=123Password')  
 cursor = cnxn.cursor()  
 #Sample of a simple SELECT  
 cursor.execute("SELECT TOP (100) Comments, count(*) FROM WideWorldImporters.Sales.Orders GROUP BY Comments")  
 row = cursor.fetchone()   
 while row:   
   print(str(row[0]) + ': ' + str(row[1]))  # str() also handles NULL (None) comments  
   row = cursor.fetchone()  

If you have configured everything correctly, running this code prints each distinct Comments value with its row count (note this example makes use of the Microsoft SQL Server demo WideWorldImporters database):


Reference: https://docs.microsoft.com/en-us/sql/connect/python/pyodbc/step-3-proof-of-concept-connecting-to-sql-using-pyodbc?view=sql-server-2017

.udl for DbConnection Check

This is a useful method to quickly check SQL credentials and/or RDBMS connectivity if working on a Windows OS. Just create an empty file in any editor (e.g. Notepad) and save it with a .udl extension, which makes it a Microsoft Data Link file. Then, right-click the file, open Properties, and use the "Connection" tab to enter the server and credentials and test the connection.

Credit to a former colleague of mine (thanks Gene David!) who showed me how to use this simple but very useful trick.



Short Selling

Broker borrows a share, sells the share high, repurchases share at lower price ($) and returns it.


Short selling stock is the practice by which a broker borrows stock with the hope that the price of that stock will fall so that he or she can sell at a high price, (re)purchase at a lower price, and pocket the difference.

Hypothetically, let's say a trader named Joe firmly believed that Apple, Inc. was about to experience a large drop in share price. To short a single share of Apple stock, Joe would do the following:

1). Borrow a share of AAPL from his portfolio, a client portfolio, or a fellow broker
2). Sell the share at the highest price he can find before a drop (say 1 share of AAPL at current $157.76)
3). Wait for the price to fall (say AAPL falls to $102.76), then purchase one share at this lower price
4). Subtract the lower price from the higher price (less fees) and return the borrowed share.
Joe earns a cool $53 from this scheme, as he sold at $157.76 and bought back for just $102.76. After fees of $2.00 this is $157.76 - $102.76 - $2.00 == $53.00.
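
The arithmetic in Joe's example can be sketched in a few lines of Python. The function name and parameters are mine, purely for illustration. Decimal is used so the cents come out exactly, and the second call uses a made-up price to show what happens if the stock rises instead of falling.

```python
from decimal import Decimal


def short_sale_profit(sell_price, buyback_price, fees, shares=1):
    """Profit from selling borrowed shares, then buying them back to return them."""
    sell = Decimal(sell_price)
    buy = Decimal(buyback_price)
    return (sell - buy) * shares - Decimal(fees)


# Joe's trade: sell at $157.76, buy back at $102.76, pay $2.00 in fees.
print(short_sale_profit("157.76", "102.76", "2.00"))  # 53.00

# Hypothetical: the price rises to $180.00 before Joe has to buy the share back.
print(short_sale_profit("157.76", "180.00", "2.00"))  # -24.24
```

A long position can lose at most what was paid for it, but a short position's loss keeps growing as long as the price keeps rising.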

While the idea of selling something short of true value is often associated with the nefarious case of a stock "short" like this, oftentimes it is a necessity. The market always needs people on both the long end (owners/buyers) and the short end (renters/sellers) for it to work properly.

This is why banks who are on the hook with a property that they cannot sell will ultimately agree to a "short sale" (selling the home for below its fair market value) to recoup at least some of their losses.

A combination of consumer preferences and financial factors determine whether to go long or short on any kind of investment or large financial transaction.



Short selling doesn't always work in the seller's favor

Refactoring Made Obvious

Refactoring (as a term if not as a practice) gets thrown around quite a bit. Sometimes really necessary refactoring doesn't get the priority it deserves because the result of a good refactoring is hard to quantify or even to see. The refactored code should behave the same as the original from the user's point of view; any differences should be optimizations or enhancements, with none of the original functionality lost along the way. We can take the case of a simple JavaScript animation, for example.

Simple animating of "+" dropping across the browser screen

Long ago, I used a mobile app that had a neat UI animation feature I really liked, but it took me a while to track down just how to accomplish it. I found some good starting points on SO, and began implementing a draft on JSFiddle.net.

The animation behavior I went about creating is simply a delayed fall of an HTML element down the screen (via the "topToBottom" variable you'll see below, which is derived from the height of the page body). In a cascading and sequential set of delays, each falling element is pushed an increasing distance from the left of the screen so that the elements can fall independently (otherwise you would see just one column for all of the falling +'s in the end result).

In the following 4 steps, I am going to present a very basic refactoring scenario- going from the code template references I found, to the final refactored code (css, html and .js).

(1) First I found a fiddle via the SO question reference below: http://jsfiddle.net/reWwx/4/

(2) I then changed the code to create my own draft on JSFiddle.net: http://jsfiddle.net/reWwx/539/

(3) Next, I created an .htm file with the CSS styles and JavaScript inline (not quite what we'd want to check into source control...):
 <html>  
 <head>  
 <style>  
 body {height: 600px; background-color: #999}  
 #line-3 {  
   position:absolute;  
   width:100%;  
   left:20px;  
   top:0px;  
 }  
 #line-4 {  
   position:absolute;  
   width:100%;  
   left:30px;  
   top:0px;  
 }  
 #line-5 {  
   position:absolute;  
   width:100%;  
   left:40px;  
   top:0px;  
 }  
 #line-6 {  
   position:absolute;  
   width:100%;  
   left:55px;  
   top:0px;  
 }  
 #line-7 {  
   position:absolute;  
   width:100%;  
   left:70px;  
   top:0px;  
 }  
 #line-8 {  
   position:absolute;  
   width:100%;  
   left:85px;  
   top:0px;  
 }  
 #line-9 {  
   position:absolute;  
   width:100%;  
   left:100px;  
   top:0px;  
 }  
 #line-10 {  
   position:absolute;  
   width:100%;  
   left:115px;  
   top:0px;  
 }  
 #line-11 {  
   position:absolute;  
   width:100%;  
   left:130px;  
   top:0px;  
 }  
 #line-12 {  
   position:absolute;  
   width:100%;  
   left:145px;  
   top:0px;  
 }  
 #line-13 {  
   position:absolute;  
   width:100%;  
   left:160px;  
   top:0px;  
 }  
 #line-14 {  
   position:absolute;  
   width:100%;  
   left:175px;  
   top:0px;  
 }  
 #line-15 {  
   position:absolute;  
   width:100%;  
   left:195px;  
   top:0px;  
 }  
 #line-16 {  
   position:absolute;  
   width:100%;  
   left:210px;  
   top:0px;  
 }  
 </style>  
 <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>  
 <script>  
 $(document).ready(function(){  
   var bodyHeight = $('body').height();  
   var footerOffsetTop = $('#line-3').offset().top;  
   var topToBottom = bodyHeight -footerOffsetTop;  
  $('#line-3').css({top:'auto',bottom:topToBottom});  
  $("#line-3").delay(100).animate({  
   bottom: '100px',  
   }, 2200);   
  $('#line-4').css({top:'auto',bottom:topToBottom});  
  $("#line-4").delay(108).animate({  
   bottom: '100px',  
   }, 2200);   
  $('#line-5').css({top:'auto',bottom:topToBottom});  
  $("#line-5").delay(145).animate({  
   bottom: '100px',  
   }, 2200);   
  $('#line-6').css({top:'auto',bottom:topToBottom});  
  $("#line-6").delay(119).animate({  
   bottom: '100px',  
   }, 2200);   
  $('#line-7').css({top:'auto',bottom:topToBottom});  
  $("#line-7").delay(115).animate({  
   bottom: '100px',  
   }, 2200);   
    $('#line-8').css({top:'auto',bottom:topToBottom});  
  $("#line-8").delay(176).animate({  
   bottom: '100px',  
   }, 2100);   
    $('#line-9').css({top:'auto',bottom:topToBottom});  
  $("#line-9").delay(13).animate({  
   bottom: '100px',  
   }, 2200);   
    $('#line-10').css({top:'auto',bottom:topToBottom});  
  $("#line-10").delay(12).animate({  
   bottom: '100px',  
   }, 2200);   
    $('#line-11').css({top:'auto',bottom:topToBottom});  
  $("#line-11").delay(11).animate({  
   bottom: '100px',  
   }, 2000);   
    $('#line-12').css({top:'auto',bottom:topToBottom});  
  $("#line-12").delay(10).animate({  
   bottom: '100px',  
   }, 2100);   
    $('#line-13').css({top:'auto',bottom:topToBottom});  
  $("#line-13").delay(11).animate({  
   bottom: '100px',  
   }, 600);   
    $('#line-14').css({top:'auto',bottom:topToBottom});  
  $("#line-14").delay(14).animate({  
   bottom: '100px',  
   }, 700);   
      $('#line-15').css({top:'auto',bottom:topToBottom});  
  $("#line-15").delay(14).animate({  
   bottom: '100px',  
   }, 800);   
      $('#line-16').css({top:'auto',bottom:topToBottom});  
  $("#line-16").delay(24).animate({  
   bottom: '100px',  
   }, 900);   
 })  
 </script>  
 </head>  
 <body>  
 <div id="line-3">+</div>  
 <div id="line-4">+</div>  
 <div id="line-5">+</div>  
 <div id="line-6">+</div>  
 <div id="line-7">+</div>  
 <div id="line-8">+</div>  
 <div id="line-9">+</div>  
 <div id="line-10">+</div>  
 <div id="line-11">+</div>  
 <div id="line-12">+</div>  
 <div id="line-13">+</div>  
 <div id="line-14">+</div>  
 <div id="line-15">+</div>  
 <div id="line-16">+</div>  
 </body>    
 </html>  
(4) And lastly I identified all of the repeating parts and made them dynamic in JavaScript, using the jQuery library to shorten much of the .js behavior:
  <html>   
  <head>   
  <style>   
  body {height: 600px; background-color: #000000; color:lime;}   
  div {   
   position:absolute;   
   top:0px;   
    width:100%;   
  }   
  </style>   
  <script src="https://ajax.googleapis.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>   
  <script>   
  $(document).ready(function(){   
   var base = $('#base');  
   var topToBottom = $('body').height();    
   for(var i=0; i<83; i++){   
    base.append('<div id="line-'+i+'" style="left:'+ Math.abs(i*10) +'px">+</div>');   
    $("#line-" + i).css({top:'auto', bottom:topToBottom}).delay(100*i).animate({bottom: '100px'}, (1000));    
   }   
  })   
  </script>   
  </head>   
  <body id="base">   
  </body>    
  </html>   

The final result behaves the same as the draft, but it eliminates repetition by dynamically generating the HTML and dynamically attaching the falling action (which is really just coordinated position property changes behind the scenes of .animate()). Eliminating duplication, standardizing for better readability, reorganizing for better clarity of code purpose, and finding patterns (or finding different patterns that are a better match for the task) are the key concepts in refactoring.

Try it yourself by copying the code above, saving to an .htm file and opening the file in a web browser.

Final JSFiddle result: https://jsfiddle.net/radagast/6uzypc80/5

Larger font is always fun: https://twickrtape.azurewebsites.net/Home/About

Reference: https://stackoverflow.com/questions/8518400/jquery-animate-from-css-top-to-bottom

JavaScript for Progress on Scrolling

This topic is a prime example of why I write this web log. I've seen this functionality on countless web pages and mobile apps, but for some reason it is not well explained in most of the areas you will likely first wind up when searching for instructions on how to do this (if you wound up here first great, yay me).

Potential applications of progress on scroll might be a "Terms and Conditions" view, a code walk-through, etc.

The key code is all in the $(window).scroll(function() { ... }) handler. You have these 3 main components:

  • Document Height: $(document).height() === height of the document being rendered
  • Window Height: $(window).height() === height of the "viewable" window aka viewport
  • Scroll Top: $(window).scrollTop() === number of pixels content is scrolled

With these values you can set the width of the progress bar (the div with the class "scroll-progress"). The width is a simple calculation: scrollTop / (docHeight - windowHeight) * 100.
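
To make the calculation concrete, here it is as a small Python function with a few sample numbers. The function name and the sample heights are made up for illustration, and the guard for pages that do not scroll and the clamping to 0-100 are my additions, not part of the jQuery code in this post.

```python
def scroll_percent(scroll_top, doc_height, window_height):
    """Percentage of the scrollable distance covered so far (0 to 100)."""
    scrollable = doc_height - window_height
    if scrollable <= 0:
        # The whole document fits in the viewport, so there is nothing to scroll.
        return 100.0
    return max(0.0, min(100.0, scroll_top / scrollable * 100))


# A 3000px document in a 600px viewport can scroll 2400px in total.
print(scroll_percent(0, 3000, 600))     # 0.0
print(scroll_percent(1200, 3000, 600))  # 50.0
print(scroll_percent(2400, 3000, 600))  # 100.0
```

Note that the bar reaches 100% when scrollTop equals docHeight - windowHeight, not docHeight, because the last viewport's worth of content is already visible at that point.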

So you can attach this logic to an anonymous function within the browser's window scroll event a la: $(window).scroll(function() { ...

And then simply assign the result of  "$(window).scrollTop() / (docHeight - windowHeight) * 100;" to the width property of your <div class="scroll-progress">.

That's it.

Code:

 <!doctype html>  
 <html>  
 <head>  
 <title></title>  
 <style>  
 .header-container {  
  width: 100%;  
  height: 60px;  
  background-color: white;  
  position: fixed;  
  z-index: 10;  
  top: 0;  
  left: 0;  
 }  
 .header {  
  padding-left: 10px;  
 }  
 h1 {  
  margin-top: 15px;  
 }  
 .scroll-progress-container {  
  width: 100%;  
  height: 5px;  
  background-color: white;  
  top: 55px;  
  position: absolute;  
 }  
 .scroll-progress {  
  width: 0px;  
  height: 5px;  
  background-color: purple;  
 }  
 .filler-text {  
  width: 60%;  
  margin-top: 80px;  
  margin-left: 50px;  
  position: absolute;  
 }  
 </style>  
 <script src="https://code.jquery.com/jquery-1.11.2.min.js"></script>  
 <script>  
  $(document).ready(function() {  
    var docHeight = $(document).height(),  
    windowHeight = $(window).height(),  
    scrollPercent;  
     $(window).scroll(function() {  
       scrollPercent = $(window).scrollTop() / (docHeight - windowHeight) * 100;  
       $('.scroll-progress').width(scrollPercent + '%');  
     });  
 });  
 </script>  
 </head>  
 <body>  
 <div class="header-container">  
 <div class="header">  
 <h1>Progress Scroll Bar Example</h1>  
 </div>  
 <div class="scroll-progress-container">  
 <div class="scroll-progress"></div>  
 </div>  
 </div>  
 <div class="filler-text">Enter lots and lots of text here...</div>  
 </body>  
 </html>  


References

https://www.veracode.com/blog/managing-appsec/building-scroll-progress-bar-javascript-and-jquery

https://stackoverflow.com/questions/14035819/window-height-vs-document-height

DNS Rebinding and the Fallacy of "Walled Gardens"

Attacking Private Networks from the Internet with DNS Rebinding

The well-written research article above by Brannon Dorsey is a must-read for any developer of home-integrated devices, and perhaps for all developers, including those whose products run inside highly secure networks, where an authenticated user can inadvertently bring in and execute malicious code (as was done with Stuxnet).

Don't let script kiddies mess with your code via DNS

Essentially, Mr. Dorsey discovered that smart home gadgets that are supposed to operate securely in private networks can be exploited from the outside. A victim only has to follow a link to a page that contains a malicious script. A malicious DNS server then switches the IP address behind the page's hostname* (compare "DNS Spoofing"), so the script's requests to the new address still appear to the browser to come from a trusted, same-origin source.

For example:

Malicious hostname: exploit.net
Malicious ip address: 59.33.12.9

Victim recent request hostname: somebank.com
Victim recent request ip address: 122.76.21.19

<<Very brief DNS Hijack via a malicious DNS server>>

Malicious hostname: exploit.net
Malicious ip address as far as victim browser client is aware: 122.76.21.19

......See the problem?

He goes on to explain how entire protocols like UPnP "are built around the idea that devices on the same network can trust each other".

Remember: what appears to the browser client to be a "same-origin" request is not always actually a same-origin request. Make sure that you change the default credentials on your network router(s) and as the article above insists:

"We need developers to write software that treats local private networks as if they were hostile public networks. The idea that the local network is a safe haven is a fallacy. If we continue to believe it people are going to get hurt."

Good to remember: protocol://host:port/path?query
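
A small Python sketch of why the IP switch goes unnoticed: an origin is only the scheme, host, and port of a URL, so the address the hostname resolves to never enters the comparison. The helper names are mine and the default-port table is a simplification; a real browser's same-origin check has more cases than this.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) triple that defines a URL's origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(url_a, url_b):
    return origin(url_a) == origin(url_b)


page = "http://exploit.net/index.html"
for resolved_ip in ("59.33.12.9", "122.76.21.19"):
    # The origin never mentions the IP, so a rebind leaves it unchanged.
    print(resolved_ip, origin(page))

print(same_origin("http://exploit.net/a", "http://exploit.net:80/b"))  # True
print(same_origin("http://exploit.net/a", "https://exploit.net/a"))    # False
```

This is why the article recommends that devices on a private network authenticate and validate requests themselves instead of relying on the browser's same-origin check.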

*DNS cache poisoning, also known as DNS spoofing, is a type of attack that exploits vulnerabilities in the domain name system (DNS) to divert Internet traffic away from legitimate servers and towards fake ones.

Visualize Hashing and Salt as Part of Password Encryption Process

The image below is a simplified and easy-to-understand illustration of how hashing and salting work. The main takeaway from this post: multiple users can have the same password, but each user has a different salt value, which makes their hash results different. When you authenticate, the password you enter is hashed with your salt and compared against the stored hash result, which is virtually always going to be unique for each user record:

Simple, no?

Even in the case of 2 users having the same hash result, the usernames will/should not be the same, so you still have distinct accounts, because UserID is also checked in the authentication process.

Companies increasingly (and for good data privacy reasons) do not even store the clear text textbox value you enter when you sign up for and then log into Fb, Google, Amazon, etc- they check your entered password's hash result against the hash result they have for your user/account record either from when you registered or last changed your password.
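
Here is a minimal Python sketch of that flow using only the standard library. The function names, the sample password, and the iteration count are my assumptions for illustration; a production system should use a vetted password-hashing setup instead of this sketch.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor for this sketch


def hash_password(password, salt=None):
    """Return (salt, digest); a fresh random salt is created when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Re-hash the entered password with the stored salt and compare the results."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


# Two users register with the same password but get different salts.
alice_salt, alice_hash = hash_password("hunter2")
bob_salt, bob_hash = hash_password("hunter2")

print(alice_hash != bob_hash)                               # True
print(verify_password("hunter2", alice_salt, alice_hash))   # True
print(verify_password("wrong", alice_salt, alice_hash))     # False
```

Only the salt and the hash result are stored per user record; the clear text password is never kept.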

Good answer to the question you may come across, "what is the difference between salt and an IV (initialization vector)?" (TL;DR: not all IVs are salt, but salt is a kind of IV): https://security.stackexchange.com/questions/6058/is-real-salt-the-same-as-initialization-vectors


Quality Control

You should know at least the surface topics surrounding TQM (Total Quality Management) because nearly all modern businesses practice TQM strategies and tactics to reduce costs and ensure top quality.

But first, check out this old video clip of America discovering something that ironically, an American (W. Edwards Deming) exported to Japan with great success years before:

1980 NBC News Report: "If Japan Can, Why Can't We?"

So big-Q "Quality" became a big hit and has been embedded in process management throughout the globe ever since.

I think he has a point here.

Here are some Quality buzz words that surely you've heard before:

ASQ - American Society for Quality

"Black Belt" - Ooo. Ahh. It does mean something. It means a person has passed a series of very difficult exams on statistics and statistical process control for quality based on the quantitative techniques and measures originated in Japan by W. Edwards Deming.

ISO 9001 - the International standard of a Quality Management System that is used to certify that business processes follow standard process and product guidelines.

Kaizen - a long-term approach to work that systematically seeks to achieve small, incremental changes in processes in order to improve efficiency and quality.

Kanban -  a visual system for managing work as it moves through a process.

Lean - a synonym for continuous improvement through balanced efficiency gains.

Example of statistical process control using UCL and LCL boundaries and a process (Fall Rate) improving.

LCL*  - Lower Control Limit - The lower boundary below which a process is considered statistically unstable.

MAIC - Measure, Analyze, Improve, Control.

Service Level Agreements (SLA) - A contract between a service provider and end user that defines the expected level of service to the end user.

UCL* - Upper Control Limit - The upper boundary above which a process is considered statistically unstable.

Uptime - Uptime is a measure of the time a service is working and available and opposite of Downtime.

Six Sigma - a statistical approach to process improvement and quality control; sometimes defined as +/-3 standard deviations from the mean (a total width of "6"), sometimes as +/-6 standard deviations from the mean.

The table above gives you an idea of realistic process improvement numbers (66,800 == a lot of defective items)


History and W. Edwards Deming
Quality Management is a permanent organizational approach to continuous process improvement. It was successfully applied by W. Edwards Deming in post-WWII Japan. Deming's work began in August 1950 at the Hakone Convention Center in Tokyo, when Deming delivered a speech on what he called "Statistical Product Quality Administration".

He is credited with helping hasten Japanese recovery after the war and then later helping American companies embrace TQM and realize significant efficiency and quality gains.


Deming's 14 Points for Total Quality Management

*Measures such as standard deviation and other distribution-based statistics determine the LCL and UCL for a process (any process- temperature of a factory floor, time to assemble a component, download/upload speed, defects per million, etc.).
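
As a rough illustration of that footnote, the Python sketch below derives an LCL and UCL from a baseline sample as the mean plus or minus three standard deviations, then flags new measurements that fall outside them. The data is made up and the function names are mine. Using the plain sample standard deviation is a simplification; real control charts often estimate the spread in other ways.

```python
from statistics import mean, stdev


def control_limits(samples, sigmas=3):
    """Return (LCL, center line, UCL) as mean -/+ sigmas * standard deviation."""
    center = mean(samples)
    spread = stdev(samples)
    return center - sigmas * spread, center, center + sigmas * spread


def out_of_control(measurements, lcl, ucl):
    """Return the measurements that fall outside the control limits."""
    return [x for x in measurements if x < lcl or x > ucl]


# Made-up baseline measurements from a stable process.
baseline = [4.1, 3.9, 4.0, 4.2, 3.8, 4.0, 4.1, 3.9]
lcl, center, ucl = control_limits(baseline)

# New measurements: only 5.2 falls outside the limits.
print(out_of_control([4.0, 4.3, 5.2, 3.9], lcl, ucl))  # [5.2]
```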

References:

http://asq.org/learn-about-quality/total-quality-management/overview/deming-points.html

https://www.quora.com/How-did-W-Edwards-Deming-influence-Japanese-manufacturing

The GOF Software Design Patterns at a Glance

If you are serious about developing software, read this book. It contains timeless concepts you should know well.

I. Creational Patterns
1. Abstract Factory - facilitates the creation of families of related objects that share an interface
2. Builder - constructs a complex object step by step (e.g. dynamic SQL statements and other dynamically constructed code)
3. Factory Method - custom way to construct an object (vs. normal/default class constructor)
4. Prototype - create new instances by copying existing instances
5. Singleton - assurance of 1 and only 1 instance of an entity in an application

II. Structural Patterns
1. Adapter - a means of making an object compatible with another object via a common interface
2. Bridge - the decoupling of an abstraction from its implementation so that both can vary independently (light switch w/fans)
3. Composite - a group of objects treated the same way as a single instance of the same type of object
4. Decorator - adding new functionality to an object by wrapping it in another object with the same interface
5. Façade - simplified interface to a complicated backend
6. Flyweight - sharing of object instances to save memory when many similar objects are needed
7. Proxy - entity serving as a stand-in for the real thing. Often a wrapper for a more complex class.

III. Behavioral Patterns
1. Chain of Responsibility - source command is served by multiple processes with distinct linear roles.
2. Command - encapsulate a request as an object, works great with HttpRequest and user actions
3. Interpreter - defines a grammar for a simple language and an interpreter that evaluates statements in it
4. Iterator - sequential access to the elements of a collection without exposing how it is stored (e.g. TV remote channel buttons)
5. Mediator - centralize complex communication and control among related objects
6. Memento - persist and recall object state
7. Observer - subscribers are notified automatically when the subject they observe changes (Twitter)
8. State - persists object state and can react in different ways according to attributes of its current state
9. Strategy - enabling an object to change/choose algorithm(s) at runtime
10. Template Method - great for interfaces; design skeleton w/base needs; allow concretes to implement details
11. Visitor - code execution varies according to visiting object (e.g. keyboard keys, mouse clicks, WiFi routers, etc.)
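
To make one of these concrete, here is a minimal Python sketch of the Strategy pattern: an object whose algorithm can be swapped at runtime. The Checkout class and the pricing functions are hypothetical names invented for this example; they are not from the book.

```python
from typing import Callable


# Each strategy is a plain callable that turns a subtotal into a final price.
def no_discount(subtotal: float) -> float:
    return subtotal


def ten_percent_off(subtotal: float) -> float:
    return subtotal * 0.9


class Checkout:
    """Context object that delegates pricing to whichever strategy it holds."""

    def __init__(self, pricing: Callable[[float], float] = no_discount):
        self.pricing = pricing

    def total(self, subtotal: float) -> float:
        return round(self.pricing(subtotal), 2)


checkout = Checkout()
print(checkout.total(50.0))         # 50.0

checkout.pricing = ten_percent_off  # swap the algorithm at runtime
print(checkout.total(50.0))         # 45.0
```

The Checkout class never needs an if/else for each pricing rule; adding a new rule means writing one more function.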


References:

https://softwareengineering.stackexchange.com/questions/234942/differentiating-between-factory-method-and-abstract-factory

https://www.cs.cmu.edu/~charlie/courses/15-214/2016-spring/slides/24%20-%20All%20the%20GoF%20Patterns.pdf

https://quizlet.com/129434321/design-patterns-gof-in-1-sentence-flash-cards/

Encryption in Transit, Encryption at Rest

The typical route of encryption is as follows: data is encrypted when a value is stored to disk or sent over a network. That data is then decrypted by an authorized application or service that holds the shared key. Once the decrypted data has been used, it is discarded or (if modified) encrypted again and sent back across the network or written back to disk, in a user privacy-respecting, encrypted format.

Encryption in Transit
You want to encrypt input, let's say a credit card number, from someone's cell phone. You need the credit card number to be encrypted while en route but ultimately decrypted after the transit is complete and the encrypted card data arrives at the merchant server. Here is what encryption in transit looks like:

Data securely encrypted on the wire going to and fro...

Encryption at Rest
Take for example the case of storing that same credit card data on the merchant's server so that the credit card information can be reused for future purchase payments. In this case, you will want to keep the data encrypted when you write it to disk in order to preserve user privacy.

The data is only decrypted when it is necessary (ie. when a new payment is processed, the encrypted data will be briefly decrypted so that it can be sent to the payment processor service).

Data safely encrypted on disk storage

Encrypted data is only ever decrypted on demand, when something requests it. Encrypted data is secure so long as only the intended parties have the shared secret key(s) to decrypt the messages.


Reference: http://blog.fourthbit.com/2014/12/23/traffic-analysis-of-an-ssl-slash-tls-session

OLAP: Facts and Dimensions

OLAP can adequately be described as the storage of redundant copies of transactional records from OLTP and other database sources. This redundancy facilitates quick lookups for complex data analysis because data can be found via more and quicker (SQL execution) paths than normalized OLTP. And OLTP after all- should stick to its namesake and what it does best: processing, auditing, and backing up online transactions, no? Leave data analysis to separate OLAP Warehouses.

OLAP data is typically stored in large blocks of redundant data (the data is organized in ways to optimize and accelerate your data mining computations). Measures are derived from the records in the fact table and dimensions are derived from the dimension tables.


Facts are "measurements, metrics or facts of a particular process" -Wikipedia. Facts are the measurement of the record value: "$34.42 for today's closing stock price", "5,976 oranges sold at the market", "package delivered at 4:57pm", etc.

Dimensions are "lists of related names–known as Members–all of which belong to a similar category in the user’s perception of a data". For example, months and quarters; cities, regions, countries, product line, domain business process metrics, etc.

Dimensions give you a set of things that you can measure and visualize in order to get a better pulse and better overall understanding of the current, past and even potential future (via regression models) shape of your data- which can often alert you to things you might not be able to see (certainly not as quickly or easily) in an OLTP-based Data Analysis model which is often tied to production tables out of necessity or "we don't have time or money to implement a Data Warehouse".
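
A tiny Python/SQLite sketch may help make facts and dimensions concrete. The table names, column names, and numbers are all made up for illustration. The fact table holds the measurements (units sold, revenue), while the dimension tables hold the things you slice them by (date, product).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, month TEXT, quarter TEXT);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE fact_sales (
    date_id INTEGER REFERENCES dim_date(date_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    units_sold INTEGER,
    revenue REAL
);
INSERT INTO dim_date VALUES (1, 'Jan', 'Q1'), (2, 'Feb', 'Q1'), (3, 'Apr', 'Q2');
INSERT INTO dim_product VALUES (1, 'Oranges'), (2, 'Apples');
INSERT INTO fact_sales VALUES
    (1, 1, 100, 50.0), (2, 1, 150, 75.0), (3, 1, 200, 100.0),
    (1, 2, 80, 64.0), (3, 2, 120, 96.0);
""")

# Measures (SUMs over the fact table) sliced by two dimensions.
rows = conn.execute("""
    SELECT d.quarter, p.name, SUM(f.units_sold), SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_date AS d ON d.date_id = f.date_id
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY d.quarter, p.name
    ORDER BY d.quarter, p.name
""").fetchall()

for row in rows:
    print(row)
```

Changing the GROUP BY to d.month, or dropping p.name, gives a different view of the same facts without touching the fact table itself.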


Yes, you definitely do need OLAP/DW capabilities if you think long-term about it. Having intimate knowledge of your operations and how you can change variables to better run your business? Why would any business person (excepting those committing fraud) not want that?

I'd say that implementing a true and effective OLAP environment is worth any project investment and would pay for itself over and over again in the way of better and more specific/actionable metrics that help administrators of operations make the best, data-backed decisions- some very critical decisions involving millions of dollars and sometimes lives. I'd like a better look at the data before making a multi-million dollar or life/death decision.


SAS, Hadoop, SSAS, with healthy doses of R and/or Python customization?- whatever data solution you choose, my advice is to go with something that has a great track record of providing the kind of solutions that your business and/or your industry require. Use the tool that your ETL developers can utilize effectively and that helps you to best meet the needs of the constituents with whom you will exchange data (the company, the customer, industry and/or regional regulation compliance).