kpitsimpl

Windowing in SQL

When something is already very clearly explained in a concise manner, there is no need for me to take a stab at explaining it. While researching the concept of windowing (running counts, SUM ... OVER ... AS), I identified the following quote which sums up this useful SQL feature very neatly:

" A window function performs a calculation across a set of table rows that are somehow related to the current row. This is comparable to the type of calculation that can be done with an aggregate function. But unlike regular aggregate functions, use of a window function does not cause rows to become grouped into a single output row — the rows retain their separate identities. Behind the scenes, the window function is able to access more than just the current row of the query result. "

Running count request on some of your data? No problem- just use something like this, repurposed for your own requirements:

SELECT xact_amt,
SUM(xact_amt) OVER (ORDER BY xact_datetime) AS running_total
FROM SOME_DB.dbo.SOME_TABLE

References:

https://community.modeanalytics.com/sql/tutorial/sql-window-functions/

https://www.postgresql.org/docs/9.1/static/tutorial-window.html

Patterns Simpl: Singleton

It is imperative to ensure atomicity (indivisible-ness) for certain "master" objects in your applications.

Singleton ensures the existence of 1 and only 1 object instance (or version/representation) of an application entity- at all times.

Some examples of Singleton:

If your program somehow winds up with 2 globally-scoped objects of a class that handles bank deposits, let's hope our account gets updated by the one with a larger dollar amount!

In Java, there is a single instance of java.lang.Runtime. If there were more than 1 language instance to interface with the Java application environment, you can see how problems would unfold.

In .NET applications, an application .config file serves as a single source of configuration values.

Log Managers should have 1 and only 1 instance, lest our log files become far too complex to sift through when we are tracking down an application issue.

Reference: https://stackoverflow.com/questions/3192095/where-exactly-the-singleton-pattern-is-used-in-real-application

C# switch case, for/foreach loops and enum

Often in the earlier years of my software development career (esp. after working solely on SQL code for a long stretch of time), I would forget the exact syntax of some commonly used C# logic routines.

I've long since memorized the exact syntax for all of these, but I'd like to help someone else out there who may need a reminder in the future.

So here is a one-stop handy reference for these 3 commonly used C# constructs:

using System;
// enum
public enum Color { Red, Green, Blue }

public class Example
{   // switch
   public static void Main()
   {
      Color c = (Color) (new Random()).Next(0, 3);
      switch (c)
      {
         case Color.Red:
            Console.WriteLine("The color is red");
            break;
         case Color.Green:
            Console.WriteLine("The color is green");
            break;
         case Color.Blue:
            Console.WriteLine("The color is blue");   
            break;
         default:
            Console.WriteLine("The color is unknown.");
            break;   
      }
   }
}

// for loop
for (int i = 1; i <= 5; i++)
{
 Console.WriteLine(i);
}

 // foreach loop
int[] fibarray = new int[] { 0, 1, 1, 2, 3, 5, 8, 13 };
foreach (int element in fibarray)
{
    System.Console.WriteLine(element);
}

Sql Connection String for .NET

Standard Security:

"Server=myServerAddress;Database=myDataBase;User Id=myUsername;
Password=myPassword;"

References:

https://www.connectionstrings.com/sql-server-2016

https://docs.microsoft.com/en-us/dotnet/framework/data/adonet/connection-string-syntax

(on unit vs. integration tests) So maybe this is the distinction I find useful. There are tests where a failure points directly to the problem.

These tests are tests either of well-understood abstractions or using only well-understood abstractions.

Then there are tests where a failure indicates a genuine problem, but where oh where? - Kent Beck

Patterns Simpl: Dependency Injection

"In software engineering, dependency injection is a technique whereby one object supplies the dependencies of another object. A dependency is an object that can be used (a service). An injection is the passing of a dependency to a dependent object (a client) that would use it."

A better quote for the initiated to DI/IoC would be:

"Dependency Injection was originally called Inversion of Control (IoC) because the normal control sequence would be the object finds the objects it depends on by itself and then calls them. Here, this is reversed: The dependencies are handed to the object when it's created. This also illustrates the Hollywood Principle at work: Don't call around for your dependencies, we'll give them to you when we need you." -javaworld.com

In .NET, .resx resource files that contain different inputs based on the culture settings of the client is an example of DI.

Another simple and clear example of Dependency Injection in action is in ASP.NET Core Authentication.

The built-in Authentication Controller and the Startup.cs Configure() method are designed to work with an interface and not one specific/concrete class of authentication provider. It uses IAuthenticationSchemeProvider which provides a generic and flexible structure that leaves the specific type of authentication up to the developer(s) (Google, Microsoft, Twitter, GitHub, Instagram, Facebook, etc.).

These different authentication types/providers are what the IAuthenticationSchemeProvider depends on to handle the authentication process for ASP.NET Core applications.

The developer injects the dependency(s) of his or her choice.

A common form of Dependency Injection is Localization of display text in applications

Loose coupling is key for interoperability with other apps and services, if your dependencies are more generic, your system is more extensible

Reference: https://codeaddiction.net/articles/10/dependency-injection-in-c---a-simple-introduction

Find all instances of a string within a string in T-SQL

CREATE FUNCTION dbo.FindPatternLocation
(
    @string NVARCHAR(MAX),
    @term   NVARCHAR(255)
)
RETURNS TABLE
AS
    RETURN 
    (
      SELECT pos = Number - LEN(@term) 
      FROM (SELECT Number, Item = LTRIM(RTRIM(SUBSTRING(@string, Number, 
      CHARINDEX(@term, @string + @term, Number) - Number)))
      FROM (SELECT ROW_NUMBER() OVER (ORDER BY [object_id])
      FROM sys.all_objects) AS n(Number)
      WHERE Number > 1 AND Number <= CONVERT(INT, LEN(@string)+1)
      AND SUBSTRING(@term + @string, Number, LEN(@term)) = @term
    ) AS y);

Simply use the function above.

If you want to understand technically how this works, see the reference below.

Essentially, this function dynamically creates a table (CTE) of the term being searched for with Number being the Row_Number of each charindex within the string.

This CTE is then searched to elicit the starting positions of each location where string term matches ([WHERE] SUBSTRING(@term + @string, Number, LEN(@term)) = @term).

Usage:

The part: "SELECT ROW_NUMBER() OVER (ORDER BY [object_id].." is there to provide space to establish the CTE with which to store search string indexes. If you need more space to traverse over (your search string is > 2k characters) you can use sys.all_columns instead of sys.all_objects or a CROSS JOIN for tables with greater than 8k records.

Reference: https://sqlperformance.com/2013/01/t-sql-queries/generate-a-set-1

All credits: Aaron Bertrand. He is a T-SQL genius. If it can be done in T-SQL, Aaron has done it, benchmarked it and probably blogged about it: https://sqlperformance.com/author/abertrand

Visual Studio Config Transforms

When you are working on software that needs to target different resources for different environments (ie. local, DEV, STAGE, PRODUCTION), you can easily manage your VS project environment through the use of .config transformation files.

Here is your default web.config:

Simply make an alternate version (with the same name) of the element you wish to change in your various configuration transformations (web.DEBUG.config, web.Development.config, etc.). In this example, we are setting up a transform to change the connection string that has the name, "applicationConnStr":

NOTE: The same configuration element (the same name, specifically) you want to transform must be in the base web.config file.

By default, the transformation we have set up will only be applied when the project is published. In order to get the transformations working with each build, you must edit the bottom of the *.csproj file with the following Target:

<TransformXml

Source="Web.Base.config"

Transform="Web.$(Configuration).config"

Destination="Web.config" />

</Target>

And be sure to uncomment the code, it is commented out by default. Now select a different build configuration, build your project, and you should see the "applicationConnStr" transformed to match the element from the web.config.[Debug||Release||etc.] that you selected to Build:

To add new web.config transform files, just right click on the web.config file and select "Add Config Transformation".

To associate your transformations with different builds, under the Build Dropdown, select "Configuration Manager..." and configure as you like:

Setting up build configurations that point to different web.configs (actually web.config "transforms") allows you to debug and run code according to your needs (ie. show DEMO at a meeting, Debug Development to uncover a bug, point to local for ongoing development, etc.).

ASP.NET Core has a very different configuration setup that uses JSON objects and more configurable things in .NET Core Startup.cs; this reference is for .NET Standard.

The key in all of this is the name of the element you are matching and then replacing. It just has to match in both places and all will be transformed on your application's web.config. In this example, we matched and replaced the value for a config element with the name, "applicationConnStr".

In a real-world scenario, there are many things that we could dynamically replace through config transforms: database connection strings, image servers, mail servers, file shares, printer locations, encryption keys, etc.).

Reference: https://docs.microsoft.com/en-us/aspnet/web-forms/overview/deployment/visual-studio-web-deployment/web-config-transformations

Customize SSRS Stylesheet

This is an easy thing to do, especially if you are well-versed in CSS/DOM designing.

1). Go to this location and open up HtmlViewer.css in your favorite text editor:

2). Edit/override styles however you see fit. You can change almost all of the styles on the SSRS toolbar:

(before)

(change made)

(after)

3). If you want to point to different stylesheets for different report executions (ie. if you need security and want to hide the Save or Print button from some users), you can easily do that by adding the filename of your own custom stylesheet to the rc:Stylesheet parameter in the SSRS Request URI a la:

Other things like client-side scripting can be done to extend the Report Viewer UI further, but doing that sort of "functionality" integration is not well documented. If you want complete control over your Reporting Services app UI, take a look at the Report Server web service: https://docs.microsoft.com/en-us/sql/reporting-services/report-server-web-service/report-server-web-service

Happy Reporting!

Useful T-SQL Dates Reference

For short US datetime format use this:

select convert(varchar, getDate(), 1) --10/15/2017

For a list of most others (SQL Server), see: https://rauofthameem.wordpress.com/2012/09/14/sql-date-time-conversion-cheat-sheet/

On to the useful SQL dates:

select dateadd(mm, datediff(mm, 1, getDate()), 0) as FirstaTheMonth

select dateadd(ms, -3, dateadd(mm, datediff(m, 0, getDate()) + 1, 0)) as LastaTheMonth

select dateadd(qq, datediff(qq, 0, getDate()), 0) as FirstDayOfQtr

select dateadd(wk, datediff(wk, 0, dateadd(dd, 6 - datepart(day, getDate()), getDate())), 0) --(First Monday of Month)

Great T-SQL References:

http://www.gehbeez.com/sql-get-first-and-last-day-of-a-week-month-quarter-year/

http://brianvanderplaats.com/cheat-sheets/T-SQL-Cheat-Sheet.html

TDD in small, digestible chunks

Test-Driven Development becomes (necessarily) much more common as code bases grow larger, diverse and more mission critical. Every piece of logic that can be explained in your software's documentation should be testable.

Write the tests, then write the code to meet those conditions, refine, repeat

Though 100% code-coverage is rarely attainable, the closer to fully covered with unit and integration tests, the more secure and solidly structured the code will be. Moreover, the dependency nightmares that many teams wrestle with will be largely mitigated as code reliability checks take place daily when TDD is fully embraced and implemented well.

I am assuming you know the basics of "why" and "how" TDD is done. If TDD does not feel natural, you may want to check out the book Test-Driven Development by Kent Beck. Reading that, what was most impressionable to me and has aided my TDD development style are the following quotes:

"Return a constant and gradually replace constants with variables until you have the real code". -on testing complex functions or objects

"Write tests you wish you had. If you don't- you will eventually break something while refactoring". -on code bases with limited test coverage.

Do not write a few large tests to evaluate multiple ideas/functions at once. Write many small tests that test something that can be said in one short sentence: "Password Reset form successfully sends Reset email to appropriate user".

Write all the tests you know you will need and the tests you wish you had. You will wind up with quite a lot of code, but also quite a lot of "proofs", and you will be able to more confidently add to and refactor your highly-maintainable TDD-developed code.

Python 3 for fun - Screen Scraping

Download Python 3: https://www.python.org/downloads/

Open cmd, type: python

from lxml import html
import requests

#Initialize your page and tree list variables
page = requests.get('http://econpy.pythonanywhere.com/ex/001.html')
tree = html.fromstring(page.content)

See your page variable's output:

(Excerpts taken from Python.org, I've updated the instructions for Python 3+)

#This will create a list of buyer elements from the DOM (divs with a specific 'title' attribute):
buyers = tree.xpath('//div[@title="buyer-name"]/text()')
#This will create a list of price elements from the DOM (spans with a specific class):
prices = tree.xpath('//span[@class="item-price"]/text()')

print ('Buyers: ', buyers)
print ('Prices: ', prices)

Expected output:

You can get a lot of useful things done in Python with relatively few lines of code.

SQL NULL cannot be applied to arithmetic operations (select NULL > 1 = NULL)

"Any arithmetic expression containing a null always evaluates to null. For example, null added to 10 is null." -http://docs.oracle.com/cd/E11882_01/server.112/e26088/sql_elements005.htm#i59110

In SQL Server, this behavior can result in string concatenation failing due to one of the values being NULL, which, as is NULL's nature, turns the entire variable into NULL.

....results in: NULL

Two easy ways to work around* the inability to evaluate NULL are as follows:

(COALESCE takes first non-NULL value specified whereas ISNULL takes just one alternate value)

Different database platforms have different rules for NULL but arithmetic ops will not work in any, as NULL is only ever NULL, and NULL is an unknown. If you are checking for the existence of data, you should be using:

select top 5 m.Column1 FROM MYTABLE m WHERE m.Column1 IS NOT NULL

*you can do this in lieu of not getting bad data in the 1st place, obviously- if the column in question needs to hold a legitimate numeric value for calculation.. your app probably should not be generating and storing null values into that column.

Invisible characters from pasted data

If you find yourself debugging an ETL task error where the problem at hand is data type mismatch or some other data error that is not visible to you in the source (culprit) data, don't fret.

What has likely happened is invisible formatting characters got into the data.

To remedy things, all you'll need to do is open up Notepad++, paste your data into the text window, and from the View menu dropdown, select "Show Symbol" >> "Show All Characters".

Simply edit the data in Windows Notepad (remove the special characters), save, and try to import again. Your data should be sufficiently scrubbed clean and your ETL task will return to normal.

There is also a shortcut button in Notepad++:

PS: handy tip for Windows users- to remove all formatting from copied data, paste the text into Notepad. All formatting will be removed from the data. Re-select it and go paste where needed.

PureText is another application that handles removing all formatting from text: https://stevemiller.net/puretext/