Saturday 2 October 2010

Dummy's Guide to Standardisation


I was going to write something about open and closed IT standards, but was struggling: it turns out to be rather hard to define exactly what open and closed mean when it comes to IT standards. That's why I postponed this post, but fortunately Dick Hirsch reminded me of it, so here it is.

Why this post? Because we are moving towards a more and more standardised world. Cloud is going to hugely drive standardisation, and Social Media are going to hugely drive mutual understanding and co-operation. Even though I once said that these are two forces at work that will greatly disrupt IT 1.0, they will work together on one thing: unifying this little planet - and my business will contribute to that.

Some standards just don't work, even though they spread like crazy: XML is one of them. The picture above is an example of trying to standardise a language, and of how that failed as a goal, although it created some slight spin-off. I have always said that XML will fail as a language, period.

Interoperability is key: but it's not technical interoperability that we first need to agree upon, it's business interoperability of course. Sure it's tough explaining something to someone in China who doesn't speak English, but it's tougher having an argument with your wife and not understanding what she's saying

If we take XML (a subject I have bashed thoroughly) as an example, I can very well explain what the difference is between useless standards and useful standards - wherein usefulness isn't measured by my personal liking or disliking.
Sure, XML makes for easy reading - please raise your hand if, as a user of any application, you read XML on a regular basis; but for machine-to-machine interaction, XML only offers disadvantages over any other language.

Google doesn't like XML for internal use, saying:
"When all of your machines and network links are running at capacity, XML is an extremely expensive proposition"
And they are right. XML is very bloated, and storage- and CPU-hungry - on average 15 times as much as any other format. Those who say it isn't so are just completely ignorant of most other formats out there. Twitter and Facebook are moving from XML to JSON, and that says enough: XML doesn't scale, and offers no added value when it comes to machine-to-machine interfacing.
Rule 1: A language (syntax) should be concise
Comparing that to natural languages, there are differences in length of words and sentences when translating, but not of this size. Books don't even double in size when translated, let alone become 15 times as thick
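
To make the size argument a bit more tangible, here is a minimal sketch that serialises one and the same record as compact JSON and as plain element-per-field XML, and compares the byte counts. The record and the function names are made up for illustration, and a tiny record like this says nothing about the 15-times figure above: it only shows where the extra bytes come from (every field name appears twice, once to open and once to close).

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical record, invented purely for this illustration.
record = {"id": 42, "name": "Alice", "country": "NL", "active": True}


def to_json_bytes(rec):
    # Compact separators drop the optional whitespace after ',' and ':'.
    return json.dumps(rec, separators=(",", ":")).encode("utf-8")


def to_xml_bytes(rec):
    # One child element per field; values are stored as element text.
    root = ET.Element("record")
    for key, value in rec.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    # With an explicit encoding, tostring() returns bytes and
    # includes the XML declaration.
    return ET.tostring(root, encoding="utf-8")


json_size = len(to_json_bytes(record))
xml_size = len(to_xml_bytes(record))
print("JSON bytes:", json_size)
print("XML bytes: ", xml_size)
```

How large the gap is depends heavily on the data and on how the XML is modelled (attributes versus elements, namespaces, and so on), so treat the numbers as an indication, not a benchmark.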

Sun gives training on the various (!) design patterns for XML schemas. Apparently there's a trade-off between designers and developers in the way a schema is organised, and there isn't a winner yet.
"Choosing the appropriate pattern is a critical step in the design phase of schemas. Once you have made a choice, switching the pattern to another one without GUI tools is tedious and error-prone"
Doesn't sound very comforting, does it? Any tools out there will have to support at least these 4 different schema patterns, and another few: some say there are 28 design patterns.
Rule 2: A language (syntax) should be simple
Comparing that to natural languages, there are basic rules for grammar. There may be more than one way to form a part of a sentence or sub-sentence, but not over two dozen ways to form an entire sentence

XBRL is a language devised for business-to-business reporting. It is an utter failure, and only a few governments use it for a very few applications, after having thrown tens of millions at it. Not even making it mandatory for reporting to government agencies helps its case much - and there is a clear reason for that: it is syntax-only, with no semantics at all.
Those places that have turned it into a mild success have added the semantics (though at a very basic level), such as the full distribution for XBRL US GAAP Taxonomy 2009.
Rule 3: A language (semantics) should be clear and well-defined
Comparing that to natural languages, those have dictionaries, in which all the known meanings of words are described.

The real trick, however, is in Pragmatics - where complete sentences are "studied".
"The men saw the girl with the binoculars"
is the classic example there. Who has the binoculars, the girl or the men? If it's the girl, she might do something with the binoculars, or not, but if it's the men, they are most likely using the binoculars to actually see the girl from a distance.
This is not so tough in English, but in many languages nouns and verbs are inflected, and it's important to know syntactically where a word belongs.
In Dutch, it gets worse: the literal translation of 'saw' in this context is 'zagen', which is the past tense of 'to see' but also the present tense of 'to saw' (the same problem exists in English, to be honest, but in Dutch it is a perfectly normal sentence).
This could be perfectly well translated as the two bottom pictures show:
The bottom left even makes sense, because we are used to that in magic shows (and horror films where there's never a happy ending to it), but the bottom right doesn't make sense at all - now try to explain that to e.g. a kid who's never seen that particular magic act.
Rule 4: A language (pragmatics) should make sense
Comparing that to natural languages, there is consensus and wisdom beyond the dictionaries. That can't be captured or explained, "it just is" - and the language should allow for it.

When defining a standard, start at rules 4 and 3: that is what the business should do. Keep in mind that you don't want to write a new dictionary, you just want to have a few meaningful conversations.
Define the topics, their bandwidth, and most importantly, what is mandatory, optional and conditional. If you're a minor, you must have a (legal) parent or a guardian - either one of those is mandatory. But if you're an adult, you may have one of those: in that case they're optional.
So, parent or guardian is neither optional nor mandatory, it is conditional: depending on whether the person's age is below 18, or 18 and above, a parent or guardian becomes mandatory or optional.
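
The parent-or-guardian rule can be written down as a small check. This is only a sketch of one conditional rule, assuming a made-up `Person` record and `validate` function; the threshold of 18 comes from the example above, everything else is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

ADULT_AGE = 18  # threshold from the example in the text


@dataclass
class Person:
    # Hypothetical record: name and age are mandatory,
    # parent and guardian are conditional on age.
    name: str
    age: int
    parent: Optional[str] = None
    guardian: Optional[str] = None


def validate(person: Person) -> List[str]:
    """Return a list of rule violations; an empty list means valid."""
    errors = []
    # Conditional rule: below 18, at least one of parent/guardian
    # is mandatory; from 18 upwards, both are optional.
    if person.age < ADULT_AGE and not (person.parent or person.guardian):
        errors.append("a minor must have a parent or a guardian")
    return errors


print(validate(Person("Kim", 12)))
print(validate(Person("Kim", 12, guardian="Sam")))
print(validate(Person("Lee", 30)))
```

The point of writing it like this is that the condition lives in the business rule, not in the field definition: the fields themselves stay optional, and the rule decides when they are required.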

After that entire exercise is done and all the possible conversations are fully modeled, when all the business rules and exceptions are in place and addressed, then the fun exercise can start: what language to use? Follow rules 1 and 2, and you'll end up with a few, or fewer. Process-wise there are a few caveats as well, but it is really rather easy to define a language: just start small but wide, and look ahead but not too far.

Last but not least: address addressing (sic): to exchange messages, you need to please the mailman. That wheel has been invented as well, ages ago: just pick any proven-technology envelope out there already
