DevOps is an emerging IT role but also a culture, and changing a culture does not happen overnight.
Elevate empowerment and give people the courage to "raise a hand" and volunteer to fix things. Even if they are wrong, interest in areas outside one's primary role is good for the entire DevOps culture.
Shared accountability - no blame game between developers, PMs, testers, QA, etc.
Do not depend on a "lone genius" or "firefighter"; share and transfer knowledge.
Offer time to learn. Encourage hacking for new features, for hardening security and for finding bugs.
Embrace failures in retrospectives to prevent repeated mistakes on future work.
Provide the right incentives to motivate the values you want to reward: reward delivery of quality vs. firefighting.
Understand "value streams" (esp. the value stream bottleneck and how all constraints can be optimized) to know where to spend time accordingly.
Focus on CONSTRAINTS.
Avoid configuration drift: config changes should cascade to all environments (QA, DEV, TEST, STAGE, PROD).
Automate the Path to Production.
Use pull-based systems so that people integrate each other's changes and learn how everything works in concert.
People should not fear for their jobs: the Systems Admin becomes more important, not less so, in DevOps.
DevOps is how you work not just what you buy or what tech you are using.
Total adoption happens in stages/iterations.
Traditional Project and Documentation mindset is outmoded, outdated, and disconnected from a living IT mission.
Lack of DevOps leads to waste and waiting (waiting for people with the right skills to work on things vs. having a team that can easily shift contexts or frameworks/languages).
SHARE KNOWLEDGE AND EMPOWER COLLEAGUES TO DO MULTIPLE TASKS AND UNDERSTAND SOME OR ALL ASPECTS OF MULTIPLE RESPONSIBILITIES OF THE SOFTWARE PROCESS CHAIN.
Do you need a jet engine to get your results from point A to B or will a modest GM sedan suffice?
I'll take the above bad analogy further and posit that while sedans and other cars on the ground require stringent rules and have to navigate much more rigid structures, a jet engine simply powers the jet ahead through constraint-less skies. Its purpose is to power something big, not to be concerned with the other machinery of the craft (i.e. the RDBMS features eschewed by NoSQL solutions).
Do you need massive global data sync scale so that millions can connect and make changes and the results all (appear) real-time? If not, NoSQL is not always the right choice and neglecting to have any kind of schema for stored application data structures can present its own host of challenges in the future if (when) those structures change. But alas, you can use NoSQL for some things (Redis, image/BLOB storage) and an RDBMS for others (more structured records and things you want to restore to a point in time in the event of a server failure).
Relational databases tend to be scaled up; NoSQL solutions are scaled out
The SQL vs NoSQL (structured and transactional vs. semi-structured and "eventually consistent") debate is not a matter of one or the other and that's that. These are complementary technologies and should both be used- wherever you find app requirements that suggest one or the other makes the most sense.
When an application is meant to scale immensely and has few data integrity, consistency, transaction or complex data structuring and transformation needs, NoSQL is your best bet. It will far outscale even the most robust RDBMS server farm, or at least do so at a much lower cost (at the cost of sacrificing RDBMS features which may not be needed).
I have personally worked on several projects that utilize relational and unstructured approaches to reading and persisting application data. If you have ever used an application's config file to change a setting in JSON or XML or a simple line entry- you are seeing a small and very basic NoSQL example of storing app data.
Hadoop and other distributed NoSQL db servers are built for scalability
Using NoSQL in software development can make data structures and objects, passed to and from APIs and within the application itself, much more flexible to work with. It takes one line to serialize an object to a JSON chunk; save it to BLOB storage and forget about it.
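The "serialize, save and forget" idea can be sketched in a few lines of Python. This is only an illustration: the UserProfile class is hypothetical, and a plain dictionary stands in for real BLOB storage.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class UserProfile:  # hypothetical application object
    user_id: str
    name: str
    tags: list


# Assumption: a dict stands in for BLOB / key-value storage.
blob_store = {}


def save_profile(profile: UserProfile) -> None:
    # The "one line": serialize the whole object and store it under its key.
    blob_store[profile.user_id] = json.dumps(asdict(profile))


def load_profile(user_id: str) -> UserProfile:
    # Deserialize the stored JSON chunk back into an in-memory object.
    return UserProfile(**json.loads(blob_store[user_id]))


save_profile(UserProfile("u1", "Ada", ["admin", "beta"]))
restored = load_profile("u1")
print(restored)
```

Note that nothing here checks the shape of the stored object; that flexibility is the convenience and also the risk.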
When dealing with relational data, you really have to understand the data to write good data access code and the underlying SQL that supports well-defined structuring of complex objects.
Well-defined structuring of the persistence of complex application objects avoids data duplication/corruption, prevents breaking reference constraints and losing any sense of hierarchical data relationships and more generally lets you know very quickly when you have a problem within your data storage structures and the objects that initialize themselves from that data.
NoSQL ditches virtually all relational database data normalization rules in favor of a loosely schema'd, unstructured (document, BLOB, key/value store, etc.) data store that relies solely on keys, values and filtering of unstructured metadata to get the same SELECT ... WHERE functionality found in an RDBMS. Its query interfaces usually bear a resemblance to Java, and common SQL statements have loosely corresponding Java or Java-derived equivalents.
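As a rough illustration of getting SELECT ... WHERE behavior from nothing but keys, values and filtering, here is a minimal Python sketch. The documents and the find helper are made up for the example and do not represent any particular NoSQL product's API.

```python
# Assumption: an in-memory dict of documents stands in for a document store.
documents = {
    "order:1": {"customer": "acme", "status": "open", "total": 120.0},
    "order:2": {"customer": "acme", "status": "closed", "total": 75.5},
    "order:3": {"customer": "globex", "status": "open", "total": 310.0},
}


def find(store, **criteria):
    """Return the keys of documents whose fields match every criterion.

    Rough analogue of: SELECT key FROM store WHERE field = value AND ...
    """
    return [
        key
        for key, doc in store.items()
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


open_acme = find(documents, customer="acme", status="open")
print(open_acme)  # ['order:1']
```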
"Apache Hadoop is an open source platform built on two technologies: Linux operating system and Java programming language."
For many application data requirements however, relational data can be overkill and totally unnecessary (ie. Redis to store key/vals vs. designing some elaborate key/val store in a SQL Server table).
Implementing something like Splunk or Kibana to continuously index app logs and configuration files can help you dip your toes into the Lake of Dark Data
The best distributed NoSQL solutions like Hadoop really shine in their inherent ability to dynamically scale to as many server machines as the operators can make ready to serve as "Hadoop processor nodes on standby".
SQL Server scaling is based more on augmenting the server or "scaling up" (adding RAM, faster SSDs, RAID arrays, etc.) than on distributing workloads across dynamic nodes. SQL Server Always On Availability Groups, Mirroring and Replication are for recoverability and data sharing, not dynamic scaling to handle bigger and bigger workloads.
SQL has been around forever. The fundamental concept behind NoSQL (semi-structured or loosely structured data) has been around since long before SQL relational database technology. Both (along with NoSQL-related graph database paradigm) will continue to serve as viable data storage solution alternatives for many more years into the 21st century.
Relational systems like SQL Server, MySQL and Oracle usually handle the structured side; NoSQL vendors are after the other 90%
In fact, SQL Server 2019's PolyBase extension supports T-SQL query integration with Hadoop clusters, MongoDB and Teradata. A new feature called SQL Server Big Data Clusters helps make distributed NoSQL nodes manageable within the SSMS environment.
Mongo, Hadoop and other NoSQL database servers have SQL server integration to support relational data sources.
SQL Server 2019 Polybase integrates Hadoop, MongoDB and many other sources with relational data and T-SQL queries
CAP Theorem: a distributed data system, like almost all NoSQL solutions, can only guarantee 2 of the 3 properties: "Consistency", "Availability" and "Partition Tolerance"
ACID vs BASE: The relational axiom of "Atomic, Consistent, Isolated, Durable" contrasted against NoSQL's vague promise of "Basically Available, Soft State, Eventual Consistency" (dirty reads common)
This ol' tried-and-true database server software ain't going away in the foreseeable future
Hundreds of millions of corporate, mid and small business applications are running along just fine in 2019 using various RDBMS platforms (SQL Server, Oracle, DB2, PostgreSQL, MySQL, etc.) for at least one of their data stores.
Many more millions of applications have been using one riff or another of NoSQL (semi-structured data) before, during and after the mythical "Relational Movement" as described by software veteran Robin Bloor:
"The Relational Model of Data Never Dominated Anyway. Estimates vary, but it is generally agreed that somewhere between 70% and 95% of the world’s data is stored only in poorly structured or unstructured formats such as: word processing documents, spreadsheets, HTML files and e-mail. The truth is that Relational database never did really dominate. It was rejected out of hand, year after year, as an effective store for many types of data."-Robin Bloor on insideanalysis.com
Google search trends over the last 5yrs certainly suggest relational SQL is not going anywhere anytime soon...
Considerations when evaluating whether to use NoSQL:
NoSQL is a precise tool for precise data needs; if relational SQL is too much for your group, NoSQL will likely be too steep a learning curve
Data Integrity: when billions of NoSQL records are affected by a small schema change that cannot propagate correctly, runs into constraint issues, or leaves hierarchy and relations impossible to infer... maybe relational SQL would be a better approach
NoSQL touts loose schema structure as a benefit, but this simply means schema and data structure enforcement has been shifted from the database layer to the application layer. Data cannot "self-manage".
Some apps are prime candidates for NoSQL's document-centric and resource-centric distributed storage architecture
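The point about schema enforcement shifting to the application layer can be made concrete with a small Python sketch. The REQUIRED_FIELDS schema and the in-memory store are assumptions for illustration; a real application would likely use a validation library and a real document store.

```python
# Hypothetical schema the application (not the database) must enforce.
REQUIRED_FIELDS = {"id": str, "name": str, "age": int}

store = {}  # assumption: stands in for a schemaless document store


def validate(doc: dict) -> list:
    """Return a list of problems; an empty list means the document is acceptable."""
    problems = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in doc:
            problems.append(f"missing field: {field}")
        elif not isinstance(doc[field], expected_type):
            problems.append(
                f"wrong type for {field}: expected {expected_type.__name__}"
            )
    return problems


def save_document(doc: dict) -> None:
    # The store itself would accept anything, so the check has to live here.
    problems = validate(doc)
    if problems:
        raise ValueError("; ".join(problems))
    store[doc["id"]] = doc


save_document({"id": "1", "name": "Ada", "age": 36})
print(validate({"id": "2", "name": "Bob"}))  # ['missing field: age']
```

Every writer to the store has to go through the same check, which is exactly the discipline a relational schema would otherwise impose for free.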
Also, there is this to consider:
(re: the longevity and simple-yet-powerful abstractions of SQL)
If NoSQL solutions are eventually able to achieve the same transactional consistency and complex schema structures that some applications require and then ultimately subsume RDBMS completely- it'll still require a lot of SQL gurus to convert and integrate all the legacy relational database apps for a long, long time to come...
Bring on MongoDB, CouchDB, Dynamo, MapReduce, HBase, BigTable, Cassandra.
As data professionals we will have an increasingly complex array of tools to understand; what we do with them will drive the future
As much as I have enjoyed working with log4net in the past, NLog works much more seamlessly for .NET apps (it has good, solid abstractions)
From the NuGet Package Manager, you can find and reference NLog and the accompanying NLog.Config, which simplifies setup. To configure NLog logging, simply point the required properties (you need at least a logFile variable, a target and a rule to get started) to your desired values in the config file that NLog.Config creates in your project root (NLog.config):
This example uses a log file target in C:\NLog directory; you can utilize a wide variety of logging targets to broadcast app errors
Source Code:
using Microsoft.VisualStudio.TestTools.UnitTesting;
using NLog;

namespace ExtRSTests
{
    [TestClass]
    public class NLogTests
    {
        // One logger per class, named after the class
        private static readonly Logger logger = LogManager.GetCurrentClassLogger();

        [TestMethod]
        public void TestLoggerToFile()
        {
            // Written to whichever targets the rules in NLog.config route these levels to
            logger.Warn("Something in the app happened that may indicate trouble ahead....");
            logger.Error("Uh-oh. Something broke.");
        }
    }
}
With the logging levels, log formatting (timestamps) and abundant integration options, NLog is a complete logging solution
With NLog you can implement logging for an array of targets including file, email, database and 3rd party integrations (i.e. send a message to a Slack channel if the logger generates any "Error" or "Fatal" level log message).
Most API payloads are in XML or JSON; it is best to know both of these data structures, and how to serialize/deserialize them
The JSON parsing utilities found in Newtonsoft.Json are very, very useful and should be common knowledge for any .NET developer who works with API data or anything producing or derived from JSON (JavaScript Object Notation).
In general, to use Newtonsoft.Json you simply need to create a .NET class hierarchy that mimics the structure and hierarchy of the target JSON. Once that is setup, serializing in-memory objects to JSON and deserializing the JSON back to in-memory objects is a breeze.
You achieve this with a normal class hierarchy, using List<> of child objects, array properties, etc. Newtonsoft's 'JsonProperty' attribute maps JSON properties to class properties, and the built-in serialization and deserialization methods facilitate working between JSON strings and the in-memory objects they represent.
The C# source code below demonstrates deserialization of a SQL Server 2017 SSRS API v2 JSON response into an object and then serializing that object back into JSON.
Source Code:
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using Newtonsoft.Json;

namespace DemoTests
{
    [TestClass]
    public class DemoTestJSON
    {
        [TestMethod]
        public async Task TestDeserializeJSON()
        {
            // Windows credentials are passed along to the local SSRS instance
            HttpClient client = new HttpClient(new HttpClientHandler() { UseDefaultCredentials = true });
            client.BaseAddress = new Uri("http://localhost/reports/api/v2.0/reports");
            var response = await client.GetAsync(client.BaseAddress);

            // JSON string -> in-memory object
            var deserial = JsonConvert.DeserializeObject<APIGenericItemsResponse>(await response.Content.ReadAsStringAsync());
            Assert.IsNotNull(deserial);
            TestSerializeJSON(deserial);
        }

        // Helper rather than a [TestMethod]: a plain MSTest test method cannot take parameters
        private static void TestSerializeJSON(APIGenericItemsResponse genericObject)
        {
            // In-memory object -> JSON string
            string serial = JsonConvert.SerializeObject(genericObject);
            Assert.IsNotNull(serial);
        }
    }

    // Mirrors the top level of the JSON response
    public class APIGenericItemsResponse
    {
        [JsonProperty("@odata.context")]
        public string Context { get; set; }

        [JsonProperty("value")]
        public List<GenericItem> GenericItem { get; set; }
    }

    // Mirrors each entry in the "value" array
    public class GenericItem
    {
        [JsonProperty("Id")]
        public string Id { get; set; }

        [JsonProperty("Name")]
        public string Name { get; set; }

        [JsonProperty("Path")]
        public string Path { get; set; }
    }
}
SSRS API v2 /Reports JSON Response:
JSON Deserialization with Newtonsoft.Json:
The deserial variable holds an in-memory .NET object of type APIGenericItemsResponse, derived (deserialized) from the SSRS API JSON response
JSON Serialization with Newtonsoft.Json:
The serial variable is simply the APIGenericItemsResponse object serialized into a JSON string
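For readers outside .NET, the same "classes that mirror the JSON" round trip can be sketched in Python with only the standard library. The class names loosely follow the C# ones, and the sample payload is made up for illustration; it is shaped like the classes above but is not a captured SSRS response.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class GenericItem:
    id: str
    name: str
    path: str


@dataclass
class ApiGenericItemsResponse:
    context: str
    items: List[GenericItem]


def deserialize(text: str) -> ApiGenericItemsResponse:
    # Map JSON property names onto the class fields by hand.
    raw = json.loads(text)
    return ApiGenericItemsResponse(
        context=raw["@odata.context"],
        items=[
            GenericItem(id=i["Id"], name=i["Name"], path=i["Path"])
            for i in raw["value"]
        ],
    )


def serialize(response: ApiGenericItemsResponse) -> str:
    # Reverse mapping: class fields back to the JSON property names.
    return json.dumps({
        "@odata.context": response.context,
        "value": [
            {"Id": i.id, "Name": i.name, "Path": i.path} for i in response.items
        ],
    })


# Hypothetical payload for the example only.
sample = (
    '{"@odata.context": "http://localhost/reports/api/v2.0/$metadata#Reports", '
    '"value": [{"Id": "abc-123", "Name": "Sales", "Path": "/Sales"}]}'
)
response = deserialize(sample)
round_tripped = json.loads(serialize(response))
print(response.items[0].name)
```

The explicit mapping functions play the role that the JsonProperty attribute plays in the C# version.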
The base keyword in C# allows a subclass to access base (superclass) members.
All credit to Suresh Dasari of Tutlane (reference below) on explaining this so effectively in just a few steps of code.
What is shown here is the Details subclass overriding the Users base class' "GetInfo()" method and including the base behavior (Console.WriteLine("Name: {0}", name); ... Console.WriteLine("Location: {0}", location);) along with some new behavior (Console.WriteLine("Age: {0}", base.age);). In this way members can be shared between subtypes and the type they inherit from, in constructors as well as elsewhere in the subclass.
using System;
namespace Tutlane
{
// Base Class
public class Users
{
public string name = "Suresh Dasari";
public string location = "Hyderabad";
public int age = 32;
public virtual void GetInfo()
{
Console.WriteLine("Name: {0}", name);
Console.WriteLine("Location: {0}", location);
}
}
// Derived Class
public class Details : Users
{
public override void GetInfo()
{
base.GetInfo();
Console.WriteLine("Age: {0}", base.age);
}
}
class Program
{
static void Main(string[] args)
{
Details d = new Details();
d.GetInfo();
Console.WriteLine("\nPress Enter Key to Exit..");
Console.ReadLine();
}
}
}
Knowledge of network configuration and administration is an (incredibly- still) underrated, underappreciated and immensely powerful tool for any IT professional to possess.
All subnet masking schemes, the mask bits in binary, available number of hosts. A "/24" is common for small LAN subnets.
One area of computer networking that should be more well-understood by software developers is the configuration of subnetworks via subnet masks. A subnet mask (ie. 255.255.255.0) is simply a way of re-purposing an IP Address by segmenting it into network and host portions.
An IPv4 address consists of 4 bytes (32 bits) of data. Each of those bytes contains 8 bits and is known as an "octet". In a 255.255.255.0 subnet mask, the first three octets are used for the network ID portion of the IP address, so only the last octet identifies the host.
At this point we could get into the logical ANDing of IP address bits and subnet mask bits, but just be aware that the masking bits allow the network portion of the IP address to be separated from the host portion; that is the key purpose of subnetting and the subnet mask.
The breakdown of a Class B IPv4 address
The subnet mask is designed to denote the number of bits in an IP address (ie. 10.9.1.14) that form the network portion (10.9.1) vs. the host portion (.14).
In this way, IPs can be used in ways they were not originally designed for, but that are altogether needed for the proper organization of something that has grown as seemingly unwieldy as the IP networks of "the Internet" (publicly accessible networks of subnetworks). With a little reference knowledge you can understand even the trickiest of subnet configurations.
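For the curious, the logical ANDing mentioned earlier can be sketched in Python using the standard ipaddress module, with the 10.9.1.14 address and a 255.255.255.0 mask from the text. The variable names are only illustrative.

```python
import ipaddress

address = ipaddress.IPv4Address("10.9.1.14")
mask = ipaddress.IPv4Address("255.255.255.0")

# Bitwise AND keeps only the bits covered by the mask: the network portion.
network_bits = int(address) & int(mask)

# ANDing with the inverted mask keeps the remaining bits: the host portion.
host_bits = int(address) & (~int(mask) & 0xFFFFFFFF)

network = ipaddress.IPv4Address(network_bits)

# Show the 32 bits of each value so the masking is visible.
print(f"{int(address):032b}  address")
print(f"{int(mask):032b}  mask")
print(f"{network_bits:032b}  network ({network})")
print("host portion:", host_bits)
```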
But wait- there is (lots) more...
The example above illustrates only a very basic subnetting situation.
Where things get tricky is when the network portion of a subnet mask does not end on an octet boundary but partway through an octet, so that the network and host portions share that octet (i.e. 255.255.128.0). In more complex network configuration scenarios it is helpful to refer to a subnet configuration reference sheet like the following to identify the subnet and/or subnet mask information you are looking for:
Describing the nature of a /29 subnet solely from knowing the IP address (10.1.1.37) of one of its hosts and that it is a /29 subnet.
Below are the 7 common pieces of information that you will need to know when analyzing subnet configurations:
Network ID: First available IP address in the subnet.
Broadcast IP: Last available address in the subnet.
First Host IP: Network ID + 1
Last Host IP: Broadcast IP - 1
Next Network: Broadcast IP + 1
# of IP Addresses: Number of IP addresses in the subnet range (subtract 2 to find the number of "usable" device IP addresses); refer to the Subnet Mask Reference Sheet
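These values can also be computed rather than looked up. The following is a minimal Python sketch using the standard ipaddress module, applied to the 10.1.1.37 host in a /29 subnet mentioned above. The describe_subnet helper and its dictionary keys are hypothetical names, and the sketch assumes a prefix of /30 or shorter so that "subtract 2" applies.

```python
import ipaddress


def describe_subnet(host_ip: str, prefix_length: int) -> dict:
    # strict=False lets us pass a host address instead of the network address.
    net = ipaddress.ip_network(f"{host_ip}/{prefix_length}", strict=False)
    return {
        "network_id": str(net.network_address),
        "broadcast_ip": str(net.broadcast_address),
        "first_host_ip": str(net.network_address + 1),
        "last_host_ip": str(net.broadcast_address - 1),
        "next_network": str(net.broadcast_address + 1),
        "num_addresses": net.num_addresses,
        # Network ID and broadcast address are not assignable to devices.
        "usable_hosts": net.num_addresses - 2,
        "subnet_mask": str(net.netmask),
    }


info = describe_subnet("10.1.1.37", 29)
for label, value in info.items():
    print(f"{label}: {value}")
```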
This enlightening example shows how MCI uses 11 bits of mask, Automation Research Systems 22 bits, ARS 24 bits, freesoft.org 32 bits- all on the same IP address; you can see the subnet hierarchy as MCI controls the entire 208.128.0.0/11 network
Online CIDR Calculator showing MCI subnet breakdown which includes the other 3 subnets shown
IP Points to remember:
IP octets (base 10 representation) are 0-inclusive so only ever max of .255 in any given octet.
Subnet Mask is a 32-bit number that indicates how many bits of an IP address are used to indicate the network portion vs. host portion and is a way to subdivide networks for organization, security and manageability.
The first address in a subnet is the network ID (generally .0), the next is conventionally given to the router (generally .1), and the last address (generally .255) is used as the subnet's broadcast address. Note these example octets are small LAN defaults/generalities and likely will not apply to a complex subnet.
Class A (0-127) uses 8 bits for the network portion of the IP address, leaving 24 bits for host IDs
Class B (128-191) uses 16 bits for the network portion of the IP address, leaving 16 bits for host IDs
Class C (192-223) uses 24 bits for the network portion of the IP address, leaving 8 bits for host IDs
CIDR is the acronym for Classless Inter-Domain Routing. The CIDR suffix (/26, /24, etc.) is just the number of IP address bits used by the subnet mask (255.255.255.0 = /24 or 24 bits of mask, 255.255.255.192 = /26 or 26 bits of mask).
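The relationship between a CIDR suffix and a dotted subnet mask is easy to check in code. Here is a small Python sketch using the standard ipaddress module; the two helper names are made up for the example, and mask_to_prefix assumes it is given a valid, contiguous mask.

```python
import ipaddress


def prefix_to_mask(prefix_length: int) -> str:
    # Build a throwaway network just to read off its netmask.
    return str(ipaddress.ip_network(f"0.0.0.0/{prefix_length}").netmask)


def mask_to_prefix(mask: str) -> int:
    # The prefix length is the number of 1 bits in the 32-bit mask.
    return bin(int(ipaddress.IPv4Address(mask))).count("1")


print(prefix_to_mask(24))                 # 255.255.255.0
print(prefix_to_mask(26))                 # 255.255.255.192
print(mask_to_prefix("255.255.255.192"))  # 26
```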
When sorting through IP ranges to determine which range a particular subnetwork group is in, use these time saving tricks recommended by PracticalNetworking:
(1) multiply the group size by 10 to skip ahead quickly, since every multiple of the group size (including the *10 multiple) is the start of a group
(2) if multiplying the group size by 10 goes beyond the IP address for which you are trying to find the subnetwork range, remember that "every group size will land on 128 eventually", so you can use that as a starting basis as well.
(3) every group size lands on the subnet value of the selected subnet and every subnet to the left of it (i.e. for a /27 subnet or ".224" subnet mask: .224, .192 and .128 will all match the start of a group)
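The arithmetic behind these tricks can be sketched in Python. It assumes the usual shortcut that the group size is 256 minus the mask octet in which the network/host boundary falls; the function names are hypothetical and everything operates on that single octet.

```python
def group_size(mask_octet: int) -> int:
    # e.g. a .224 mask octet gives groups of 32 addresses
    return 256 - mask_octet


def group_starts(mask_octet: int) -> list:
    # Every multiple of the group size is the start of a group.
    return list(range(0, 256, group_size(mask_octet)))


def group_start_for(host_octet: int, mask_octet: int) -> int:
    # Round the host octet down to the nearest multiple of the group size.
    size = group_size(mask_octet)
    return (host_octet // size) * size


print(group_size(224))           # 32
print(group_starts(224))         # [0, 32, 64, 96, 128, 160, 192, 224]
print(group_start_for(37, 248))  # 32
```

The /27 output shows tricks (2) and (3) directly: 128, 192 and 224 all appear as group starts.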
"The representation of numbers, as physically measured on the surface of the graphic itself, should be directly proportional to the quantities represented." -Edward Tufte
It is amazing how easy it is to find highly inaccurate and misleading data graphics and charts even in this year 2019. These inaccuracies and sometimes outright perversions of the truth are of particular concern to an insta-culture who gets its news in headlines, memes, charts and other bite-sized generalizations via social media and rarely looks for the evidence beyond the headlines and the source data behind the charts.
The “Lie Factor”, first defined by American statistician Edward Tufte, is "a value to describe the relation between the size of effect shown in a graphic and the size of effect shown in the data." A larger Lie Factor value indicates a higher level of deception or "inaccurate scaling/weighting".
Lie Factor in Action:
The numbers do not equate to the scale of the bars and money bags... not quite as "strong" as projected.
This example mixes 2 different scales and data sets and only serves to confuse the reader...
This is a propaganda data graphic displaying a series of 5 increases using a totally nonsensical scale
This graphic shows Last Year, Last Week, and Current Week as having the same temporal scale.... O'Lie Factor.
Lie Factor Breakdown:
Lie Factor is the change shown in the graphic (say 100%) divided by the change reported in the data (say "50%") - (100/50 = a LF of 2)
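The calculation above is simple enough to sketch in Python. The measurements below are hypothetical: a bar drawn growing from 2.0 cm to 4.0 cm (a 100% change on the page) to represent data that grew from 10 to 15 (a 50% change).

```python
def percent_change(first: float, second: float) -> float:
    # Size of effect, expressed as a percentage of the starting value.
    return (second - first) / first * 100.0


def lie_factor(graphic_first: float, graphic_second: float,
               data_first: float, data_second: float) -> float:
    # Change shown in the graphic divided by the change present in the data.
    shown = percent_change(graphic_first, graphic_second)
    actual = percent_change(data_first, data_second)
    return shown / actual


# Hypothetical measurements: bar heights in cm vs. the underlying values.
print(lie_factor(2.0, 4.0, 10.0, 15.0))  # 2.0
```

A value of 1 means the graphic is proportional to the data.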
There are reasons for misleading graphics that go beyond propaganda and sensationalist news articles:
Lack of quantitative skills on the part of the graphic creator and publication editor
Doctrine that statistics are boring and therefore need to be "jazzed up"
Doctrine that graphics are only for the unsophisticated and so don't need "accuracy constraints"
Failure to treat graphics with the same fidelity to the truth as the written word it accompanies
Other ways that graphical information displays are corrupted include cherry-picking data, making small changes appear large by showing a small scale interval and when all else fails for information manipulators- using fake data.
It is important to not jump to conclusions when assessing graphical information displays even if it is coming from a reputable publisher. As you can see it is not always obvious that the information being communicated graphically is accurate. Wherever possible, get a look at the source data.
"When we see a chart or diagram, we generally interpret its appearance as a sincere desire on the part of the author to inform. In the face of this sincerity, the misuse of graphical material is a perversion of communication, equivalent to putting up a detour sign that leads to an abyss" - Wainer
The Google Maps API is a very powerful tool that is relatively easy to use if you have some JavaScript background. The below screen was created with the code that follows (Google API Key obfuscated).
Basic Maps API example with hard-coded lat/long custom markers
<!DOCTYPE html>
<html>
<head>
<title>Custom Markers</title>
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=no">
<meta charset="utf-8">
<style>
/* Always set the map height explicitly to define the size of the div
* element that contains the map. */
#map {
height: 100%;
}
/* Optional: Makes the sample page fill the window. */
html, body {
height: 100%;
margin: 0;
padding: 0;
}
</style>
</head>
<body>
<div>
Debug/Inspect...
</div>
<div id="map"></div>
<script>
var map;
function initMap() {
map = new google.maps.Map(
document.getElementById('map'),
{center: new google.maps.LatLng(43.0589, -88.0988), zoom: 11, mapTypeId: 'hybrid'});
var icons = {
bucks: {
icon: 'bucks.png'
},
brewers: {
icon: 'brewers.png'
},
panthers: {
icon: 'panthers.jpg'
}
};
var features = [
{
position: new google.maps.LatLng(43.0280, -87.9712),
type: 'brewers'
}, {
position: new google.maps.LatLng(42.9930, -87.9210),
type: 'panthers'
}, {
position: new google.maps.LatLng(43.0000, -87.8379),
type: 'bucks'
}
];
// Create markers.
for (var i = 0; i < features.length; i++) {
var marker = new google.maps.Marker({
position: features[i].position,
icon: icons[features[i].type].icon,
map: map
});
}
//Must have API with access to the Places API to get marker click details without much manual code
}
</script>
<script async defer src="https://maps.googleapis.com/maps/api/js?key=XXXXXXXXXXXXXXXXXXXX&callback=initMap">
</script>
</body>
</html>
I once in the not-too-distant past had a Developer interview go comically off the rails when I (the interviewee) was posed with the simple software development question:
"Describe the difference between a value type variable and a reference type variable"
This is an elementary Comp Sci 101 question that every developer should be able to understand and answer with unflinching confidence.
This is what value type (int a left) and reference type (object a on the right) memory allocation looks like graphically
The correct answer would point out that value type variables ('primitive' in Java) hold their data directly (for local variables, typically on the Stack), while reference type variables contain pointers (id) to another memory location in the Heap (which may in turn contain additional memory references (pointers) that compose the entire object).
And (no I swear really) at one point I knew exactly what the difference was. However with time and as development languages and tools become more and more abstracted away from the underlying behavior of memory during application run-time, my knowledge of this simple-yet-important concept went blank.
I stammered through a meandering non-answer about how value types contain "simple" data types that are known as primitives and how reference types refer to more complex objects. Suffice it to say that it was an embarrassing lack of clarity and awareness of how application memory works and likely torpedoed any chance of an offer.
On value and reference types, the Stack and the Heap and Boxing/Unboxing (converting between val and ref)
This is something we should all know as developers. I hope this helped refresh and/or clarify the concept of variable memory allocation for at least a few people out there.
ETL is the process by which you can take (Extract) data from various (usually related) data sources, Transform that data to meet your destination system's needs, and finally Load that transformed data into the destination system data store.
Your table structure will be something along the lines of this basic template:
In a real-world db environment Staging, OLAP, OLTP and other data repos may be on different database servers, this is same db server for demonstration
We will use SQL Server Integration Services (SSIS) and develop the SSIS package within the Visual Studio 2017 IDE.
The first step of the SSIS package loads (INSERTs) the data into a STAGING area database. This allows us to:
Store off the intermediate data from all sources into analysis-friendly OLAP datastores
Perform data integrity checks
Keep extraction and transformation as two strictly separated steps
We load the data from the various source files (.csv, .xls, .xlsx) into SQL Server database Staging table(s) using SSIS Source and Destination Data Flow Tasks connected with the movable data flow arrows. Once you have connected a source and destination you can go into the Destination Data Flow Task and edit the mappings of which source columns should be written to which destination columns.
Next we perform some transformations. This can be anything from a simple ranking or status/flagging/business prioritization algorithm to data cleansing to data partitioning based on certain criteria; the key is that this Transform step is where we apply T-SQL UPDATEs to transform the data once it has all been aggregated in Staging.
Then we refresh the OLAP destination tables using the same kind of Source and Destination Data Flow Tasks and mappings as used for Staging. The OLAP data is used for data analysis.
Finally, we load the cleansed Staging data into our destination system's OLTP database and email or text message the system owner upon successful completion of the SSIS ETL job (or deliver an error if anything fails). The OLTP data stores live transactions.
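The overall shape of these steps (extract into staging, transform with UPDATEs, load into the destination) can be sketched outside SSIS as well. The following Python example uses the standard csv and sqlite3 modules with an in-memory database. The table names, columns and sample rows are all made up for illustration, and the OLAP refresh and notification steps are left out.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for the .csv/.xls/.xlsx source files.
SOURCE_CSV = "id,name,amount\n1, alice ,10.5\n2,BOB,20\n3, carol,\n"

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staging (id INTEGER, name TEXT, amount TEXT, "
    "is_valid INTEGER DEFAULT 1)"
)
conn.execute(
    "CREATE TABLE destination (id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
)

# Extract: copy the raw rows into the staging table untouched.
rows = list(csv.DictReader(io.StringIO(SOURCE_CSV)))
conn.executemany(
    "INSERT INTO staging (id, name, amount) VALUES (:id, :name, :amount)", rows
)

# Transform: cleanse and flag the data in staging with UPDATE statements.
conn.execute("UPDATE staging SET name = UPPER(TRIM(name))")
conn.execute(
    "UPDATE staging SET is_valid = 0 WHERE amount IS NULL OR TRIM(amount) = ''"
)

# Load: copy only the rows that passed the integrity check.
conn.execute(
    "INSERT INTO destination (id, name, amount) "
    "SELECT id, name, CAST(amount AS REAL) FROM staging WHERE is_valid = 1"
)
conn.commit()

loaded = conn.execute(
    "SELECT id, name, amount FROM destination ORDER BY id"
).fetchall()
print(loaded)
```

Keeping the raw rows in staging and doing all cleansing there is what keeps extraction and transformation strictly separated.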
Bear in mind that most ETL data-flow step mappings are not a 1:1 match; this is just an e2e demo of SSIS ETL in most basic form
Happy ETL'ing, and be sure to watch out for cases of mysterious symbols/characters from miscellaneous data copied from other programs or from other system environments that were using a Language setting (codepage) which is incompatible with your ETL software. Bad data happens more than you think and as we say, GIGO.
Your end result looks like this (all Green Checkmarks indicate all was successful); I recommend using PaperCut for SMTP testing, a super cool and useful product
I would attach or GitHub the source code (and will do so upon request) but SSIS project code has a lot of dependencies and can get quite messy for someone else to re-use on even just a 'slightly different' machine.
Having used SSIS' now-deprecated predecessor "DTS" (SQL Server Data Transformation Services) and SSIS for many years I can attest to the fact that the best way to learn this product is by diving right in on your own and begin the creation of sources and destinations and source/destination connection managers and control flow events, and .NET integration, and exception event handlers, etc.
You will likely run into some ambiguous and not well-documented errors when developing in SSIS; but persist in your efforts and you can create a very powerful EDI system with the many capabilities of SSIS and the robust Scheduled ETL jobs that it can create.
Java or .NET? Why not both (when it is the only viable path)?
jni4net is a proven interop library for Java and .NET. The two brief examples below, developed by jni4net, merely require that you specify the jni4net dependency in the (Visual Studio or Eclipse) project.
Calling Java from .NET
using java.io;
using java.lang;
using java.util;
using net.sf.jni4net;
using net.sf.jni4net.adaptors;
namespace helloWorldFromCLR
{
public class Program
{
private static void Main()
{
// create bridge, with default setup
// it will lookup jni4net.j.jar next to jni4net.n.dll
Bridge.CreateJVM(new BridgeSetup(){Verbose=true});
// here you go!
java.lang.System.@out.println("Hello Java world!");
// OK, simple hello is boring, let's play with Java properties
// they are really a Hashtable
Properties javaSystemProperties = java.lang.System.getProperties();
// let's enumerate all keys.
// We use Adapt helper to convert enumeration from Java to .NET
foreach (java.lang.String key in Adapt.Enumeration(javaSystemProperties.keys()))
{
java.lang.System.@out.print(key);
// this is automatic conversion of CLR string to java.lang.String
java.lang.System.@out.print(" : ");
// we use the hashtable
Object value = javaSystemProperties.get(key);
// and this is CLR ToString() redirected to Java toString() method
string valueToString = value.ToString();
java.lang.System.@out.println(valueToString);
}
// Java output is really Stream
PrintStream stream = java.lang.System.@out;
// it implements java.io.Flushable interface
Flushable flushable = stream;
flushable.flush();
}
}
}
Calling .NET from Java
import net.sf.jni4net.Bridge;
import java.io.IOException;
import java.lang.String;
import system.*;
import system.Object;
import system.io.TextWriter;
import system.collections.IDictionary;
import system.collections.IEnumerator;
public class Program {
public static void main(String[] args) throws IOException {
// create bridge, with default setup
// it will lookup jni4net.n.dll next to jni4net.j.jar
Bridge.setVerbose(true);
Bridge.init();
// here you go!
Console.WriteLine("Hello .NET world!\n");
// OK, simple hello is boring, let's play with System.Environment
// they are really a Hashtable
final IDictionary variables = system.Environment.GetEnvironmentVariables();
// let's enumerate all keys
final IEnumerator keys = variables.getKeys().GetEnumerator();
while (keys.MoveNext()) {
// the hash table is not generic and returns system.Object
// but we know it should be system.String, so we could cast
final system.String key = (system.String) keys.getCurrent();
Console.Write(key);
// this is automatic conversion of JVM string to system.String
Console.Write(" : ");
// we use the hashtable
Object value = variables.getItem(key);
// and this is JVM toString() redirected to CLR ToString() method
String valueToString = value.toString();
Console.WriteLine(valueToString);
}
// Console output is really TextWriter on stream
final TextWriter writer = Console.getOut();
writer.Flush();
}
}
(verbose commenting by Pavel Savara, a jni4net contributor)
You may find yourself with the need to integrate a .NET method within SQL Server to be called as a function. This usually happens when some relatively complex looping and modifying logic is a requirement of a SQL operation.
SQL is a great data language but it is not the right language for some tasks. Creating a SQL CLR from a .NET assembly may be the best approach to some unique situations (and there is a bonus in that, in many cases you can reuse existing .NET code).
Before creating the CLR object we need a .NET .dll, so first we create a basic .NET assembly, compile it in Release mode and copy the path to the compiled .dll:
This is the simple .NET CLR method that we want to run within the SQL Server query execution engine
SQL CLR provides a way for you to integrate complex .NET methods within SQL Server
Import into SQL Server instance via SSMS*:
Select New Assembly...
...and then enter the path to your Release .dll
Create T-SQL function or stored procedure to serve as caller for the function and run it:
From here we can see all of the T-SQL code involved; the 3 SQL Server configuration conditions (shown in the 3 EXEC statements) are required
And that is all there is to it. Only use CLR functions when absolutely necessary, as RDBMSs like SQL Server are designed to process relational data in sets, not to apply complex business logic on individual rows.
But if there is no other way- SQL CLRs could provide you a solution to your code/logic integration problems.