The sum of us

10 July 2017

What is the definition of the term “data scientist”…?

In my previous post, Painting by numbers, I offered a shorthand definition of data science based on what I could synthesise from the interwebs. Namely, it is the combination of statistics, computer programming, and domain expertise to generate insight. It follows, then, that the definition of data scientist is someone who has those skill sets.

Fat chance!

In this post I intended to articulate my observation that in the real world, incredibly few people could be considered masters of all three disciplines. I was then going to suggest that rather than seeking out these unicorns, employers should build data science teams comprising experts with complementary talents. I say “was” because I subsequently read this CIO article by Thor Olavsrud in which he quotes Bob Rogers saying, well… that.

Given Thor and Bob have stolen my thunder (18 months ago!), I think the only value I can add now is to draw a parallel with pop culture. So I will do so with the geeky HBO sitcom Silicon Valley.

The cast of Silicon Valley: Dinesh, Gilfoyle, Richard, Jared and Erlich.

If you aren’t familiar with this series, the plot revolves around the trials and tribulations of a start-up called Pied Piper. Richard is the awkward brainiac behind a revolutionary data compression algorithm, and he employs a sardonic network engineer, Gilfoyle, and another nerdy coder, Dinesh, to help bring it to market. The other team members are the ostentatious Erlich – in whose incubator (house) the group can work rent-free in exchange for a 10% stake – and Jared, a mild-mannered economics graduate who could have been plucked from the set of Leave It to Beaver.

The three code monkeys are gifted computer scientists, but they have zero business acumen. They are entirely dependent on Jared to write up their budgets and forecasts and all the other tickets required to play in the big end of town. Gilfoyle and Dinesh’s one attempt at a SWOT analysis is self-serving and, to be generous, NSFW.

Conversely, Jared would struggle to spell HTML.

Arguably the court jester, Erlich, is the smartest guy in the room. Despite his OTT bravado and general buffoonery, he proves his programming ability when he rolls up his sleeves and smashes out code to rescue the start-up from imploding, and he repeatedly uses his savvy to shepherd the fledgling business through the corporate jungle.

Despite the problems and challenges the start-up encounters throughout the series, it succeeds not because it is a team of unicorns, but because it comprises specialists and a generalist who work together as a team.

And so the art of Silicon Valley shows us how unlikely we would be in real life to recruit an expert statistician / computer programmer / business strategist. Each is a career in its own right that demands years of education and practice to develop. A jack-of-all-trades will inevitably be a master of none.

That is not to say a statistician can’t code, or a programmer will be clueless about the business. My point is, a statistician will excel at statistics, a computer programmer will excel at coding, while a business strategist will excel at business strategy. And I’m not suggesting the jack-of-all-trades is useless; on the contrary, he or she will be the glue that holds the specialists together.

So that raises the question… which one is the data scientist?

Since each is using data to inform business decisions, I say they all are.

Painting by numbers

3 June 2017

A lifetime ago I graduated as an environmental biologist.

I was one of those kids who did well in school, but had no idea what his vocation was. As a pimply teenager with minimal life experience, how was I to know even half the jobs that existed?

After much dilly-dallying, I eventually drew upon my nerdy interest in science and my idealistic zeal for conservation and applied for a BSc. And while I eventually left the science industry, I consider myself extremely fortunate to have studied the discipline because it has been the backbone of my career.

Science taught me to think about the world in a logical, systematic manner. It’s a way of thinking that is founded on statistics, and I maintain it should inform the activities we undertake in other sectors of society such as Learning & Development.

The lectures I attended and the exams I crammed for faded into a distant memory, until the emergence of learning analytics rekindled the fire.

Successive realisations have rapidly dawned on me: I love maths and stats; I’ve drifted away from them over time; the world is finally waking up to the importance of the scientific method; and it is high time I refocused my attention on them.

So it is in this context that I have started to review the principles of statistics and its contemporary manifestation, analytics. My exploration has been accompanied by several niggling queries: what’s the difference between statistics and analytics? Is the latter just a fancy name for the former? If not, how not?

Overlaying the post-modern notion of data science, what are the differences among the three? Is a data scientist, as Sean Owen jokingly attests, a statistician who lives in San Francisco?

The DIKW Pyramid

My journey of re-discovery started with the DIKW Pyramid. This beguilingly simple triangle models successive orders of epistemology, which is quite a complex concept. Here’s my take on it…

The DIKW Pyramid, with Data at the base, Information a step higher, Knowledge another step higher, and Wisdom at the peak.

At the base of the pyramid, Data is a set of values of qualitative or quantitative variables. In other words, it is the collection of facts or numbers at your disposal that somehow represent your subject of study. For example, your data may be the weights of 10,000 people. While this data may be important, if you were to flick through the reams of numbers you wouldn’t glean much from them.

The next step up in the pyramid is Information. This refers to data that has been processed to make it intelligible. For example, if you were to calculate the average of those ten thousand weights, you’d have a comprehensible number that is inherently meaningful. Now you can do something useful with it.

The next step up in the pyramid is Knowledge. To avoid getting lost in a philosophical labyrinth, I’ll just say that knowledge represents understanding. For example, if you were to compare the average weight against a medical standard, you might determine these people are overweight.

The highest step in the pyramid is Wisdom. I’ll offer an example of wisdom later in my deliberation, but suffice it to say here that wisdom represents higher order thinking that synthesises various knowledge to generate insight. For example, the wise man or woman will not only know these people are overweight, but also recognise they are at risk of disease.

Some folks describe wisdom as future focused, and I like that because I see it being used to inform decisions.
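
For the programmers among us, here is a minimal Python sketch of that climb, using the weights example. It is only an illustration: the eight weights are made up, and HYPOTHETICAL_LIMIT_KG and advise() are names and values I have invented, not a medical standard.

```python
from statistics import mean

# Data: raw values. These eight weights (kg) are made-up illustrative numbers.
weights_kg = [62.0, 71.5, 88.0, 94.5, 79.0, 103.0, 85.5, 90.5]

# Information: the data processed into something intelligible.
average_kg = mean(weights_kg)

# Knowledge: the information interpreted against a standard.
# HYPOTHETICAL_LIMIT_KG is an invented threshold, not a medical guideline.
HYPOTHETICAL_LIMIT_KG = 80.0
is_overweight = average_kg > HYPOTHETICAL_LIMIT_KG


def advise(overweight: bool) -> str:
    """Wisdom: combine the knowledge with other knowledge to inform a decision."""
    if overweight:
        return "At risk of disease: consider diet and exercise programmes."
    return "No action suggested by this measure."


print(f"Information: average weight is {average_kg:.2f} kg")
print(f"Knowledge: overweight on average? {is_overweight}")
print(f"Wisdom: {advise(is_overweight)}")
```

Each step consumes the output of the one below it, which is really all the pyramid is saying.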


Statistics

My shorthand definition of statistics is the analysis of numerical data.

In practice, this is done to describe a population or to compare populations – that is to say, infer significant differences between them.

For example, by calculating the average weight of 10,000 people in Town A, we describe the population of that town. And if we were to compare the weights of those 10,000 people with the weights of 10,000 people in Town B, we might infer the people in Town A weigh significantly more than the people in Town B do.

Similarly, if we were to compare the household incomes of the 10,000 people in Town A with the household incomes of the 10,000 people in Town B, we might infer the people in Town A earn significantly less than the people in Town B do.

Then if we were to correlate all the weights against their respective household incomes, we might demonstrate they are negatively correlated: as one goes up, the other tends to go down.
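
To make the describe / compare / correlate steps concrete, here is a rough Python sketch using only the standard library. The two towns are simulated, and the relationship between income and weight is something I have assumed purely for illustration. simulate_town(), welch_t() and pearson_r() are my own helper names. Welch's t statistic is just one reasonable way of comparing two means.

```python
import math
import random
from statistics import mean, variance

random.seed(0)  # fixed seed so the made-up data is reproducible


def simulate_town(n, mean_income):
    """Return n (income, weight) pairs. The relationship is invented."""
    people = []
    for _ in range(n):
        income = random.gauss(mean_income, 10_000)
        # Assumed for illustration only: weight falls as income rises.
        weight = 110 - 0.0005 * income + random.gauss(0, 5)
        people.append((income, weight))
    return people


def welch_t(a, b):
    """Welch's t statistic for the difference between two sample means."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = mean(x), mean(y)
    covariance = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    spread_x = sum((xi - mx) ** 2 for xi in x)
    spread_y = sum((yi - my) ** 2 for yi in y)
    return covariance / math.sqrt(spread_x * spread_y)


town_a = simulate_town(10_000, mean_income=50_000)
town_b = simulate_town(10_000, mean_income=70_000)

weights_a = [weight for _, weight in town_a]
weights_b = [weight for _, weight in town_b]

# Describe: summarise one population.
print(f"Town A average weight: {mean(weights_a):.1f} kg")

# Compare: a large positive t suggests Town A is heavier than Town B.
t_stat = welch_t(weights_a, weights_b)
print(f"Welch t statistic (A vs B): {t_stat:.1f}")

# Correlate: weight against household income across both towns.
everyone = town_a + town_b
r = pearson_r([income for income, _ in everyone], [weight for _, weight in everyone])
print(f"Correlation of income with weight: {r:.2f}")
```

A proper analysis would also turn the t statistic into a p-value, which I have left out to keep the sketch dependency-free.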

The DIKW Pyramid, showing statistics converting data into information.

Thus, our statistical tests have used mathematics to convert our data into information. We have climbed a step up the DIKW Pyramid.


Analytics

My shorthand definition of analytics is the analysis of data to identify meaningful patterns.

So while analytics is often conflated with statistics, it is indeed a broader expression – not only in terms of the nature of the data that may be analysed, but also in terms of what is done with the results.

For example, if we were to analyse the results of our weight-related statistical tests, we might recognise an obesity problem in poor neighbourhoods.

The DIKW Pyramid, showing analytics converting data into knowledge.

Thus, our application of analytics has used statistics to convert our data into information, which we have then translated into knowledge. We have climbed another step higher in the DIKW Pyramid.

Data science

My shorthand definition of data science is the combination of statistics, computer programming, and domain expertise to generate insight. Or so I’m led to believe.

Given the powerful statistical software packages currently available, I don’t see why anyone would need to resort to hand coding in R or Python. At this early stage of my re-discovery, I can only assume the software isn’t sophisticated enough to compute the specific processes that people need.

Nonetheless, if we return to our obesity problem, we can combine our new-found knowledge with existing knowledge to inform strategic decisions. For example, given we know a healthy diet and regular exercise promote weight loss, we might seek to improve the health of our fellow citizens in poor neighbourhoods (and thereby lessen the burden on public healthcare) by building sports facilities there, or by subsidising salad lunches and fruit in school canteens.

The DIKW Pyramid, showing data science converting data into wisdom.

Thus, not only has our application of data science used statistics and analytics to convert data into information and then into knowledge, it has also converted that knowledge into actionable intelligence.

In other words, data science has converted our data into wisdom. We have reached the top of the DIKW Pyramid.

The definition of Enterprise Social Network

26 August 2015

Enterprise Social Network, n. 1. A software platform that facilitates communication and collaboration among the employees of a company. 2. A means of liking senior executives' posts.

The paradox of augmented reality

27 May 2013

Sydneysider Scott O’Brien is back in town after an extended stint in San Francisco. Scott is the Co-founder & CMO of Explore Engage, a digital media company that is attracting serious attention for its augmented reality eyewear.

I caught up with Scott in the harbour city and asked him the following questions…

  • What are your favourite examples of augmented reality? (0:08)
  • What are you working on at Explore Engage? (2:20)
  • How does your eyewear differ from Google Glass? (3:05)
  • Is augmented reality worth the hype? (3:54)
  • What opportunities exist for the finance sector? (5:04)
  • What opportunities exist for workplace education? (6:14)

I was impressed with the examples of augmented reality cited by Scott.

Medical education has long been the poster boy of salivatingly engaging content, and the tradition continues with this emerging technology. Daqri’s 4D Anatomy app showcases the visualisation capabilities of the medium, while the Australian Defence Force not only targets a real-world need with their Mobile Medic app, but also incorporates it into their recruitment process.

Ingress Enlightened logo

Google’s Ingress is an augmented reality MMOG that exemplifies the gamification capability of the medium. Two factions fight for control over the real world by capturing virtual “portals” that are represented by public landmarks such as statues and fountains.

The fact that Ingress was developed by Google’s internal startup, Niantic Labs, is enlightening (excuse the pun). Augmented reality is still an emerging technology in which experiments must be undertaken and failures borne. It is by learning from the results, and responding to them via adaptation, that you increase the probability of breakthrough success.

I am also fascinated by Google’s marketing strategy with Ingress. The game is in “closed beta” mode, which means you need an invitation to play it. Reminiscent of Studio 54, only the members of the “in” crowd have the privilege of enjoying that which is denied to others. Google deepens the mystique by seemingly neglecting to promote the product – instead relying on organic growth of the subculture.

On the subject of Google, I think Scott’s differentiation between Google Glass and Explore Engage’s Augmented Reality Eyewear is an important one. While Google Glass has augmented reality capability, it is essentially a wearable computer with which digital information is conveniently presented in front of the wearer’s eye. In contrast, the Explore Engage eyewear is specifically designed to integrate digital information with the real-world background. There is no better example of the latter concept than BMW’s Augmented Reality Glasses – which aren’t Explore Engage’s by the way, but are oh so sexy all the same.

While I’m on my definitions soapbox, I’ll take this opportunity to point the finger at Star Chart. This is a wonderful (and free) app, but its so-called “augmented reality mode” is no such thing; it does not lay its stellar information over the night sky! In contrast, Sun Seeker lays the sun’s trajectory over the real background. In other words, it augments reality.


In terms of ROI, 2.5 million downloads of Transformers 3’s Defend the Earth speaks for itself. The return on Audi’s Virtual Q3 is less obvious, but that’s because it’s less about car sales and more about engaging consumers and associating the brand with innovation. How do you evaluate that? By analysing car sales of course, after the Q3 finally lands on Aussie shores.

While the Commonwealth Bank should be applauded for their Property Guide app, which combines geolocation with big data to provide something truly useful to their prospective customers, I must say as someone in the financial services industry: the general lack of financially oriented augmented reality apps represents a typical lack of imagination in the sector. Worse still, the examples highlighted by Infosys’s whitepaper are almost exclusively home finders and ATM locators, which means they’ve merely copied each other. Yawn.

As I am concurrently in the education profession, however, I must also recognise that the potential for augmented reality remains largely untapped. Scott’s examples attest to the power of the medium in terms of visualisation, gamification and performance support – which are factors that make education in the workplace engaging and effective. So what are we waiting for?

I think the mobility of the technology also remains underexploited. For example, how about an architecture tour of your local city in which details of buildings are highlighted when you point your mobile device at them? Or even better, when you look at them through your AR-enabled glasses?

And Scott’s mention of avatars adds more fuel to the fire of possibility. I imagine learning interventions in dangerous environments (such as mining sites) in which training can be undertaken in context, minus the threat to life or limb. Unlike in a simulator or a virtual world, the training is done at the workplace.

Therein lies the paradox of augmented reality. By complementing the real world with artificiality, it makes the learning experience more authentic.

Human enough

19 February 2013

It is with glee that the proponents of e-learning trumpet the results of studies such as the US Department of Education’s Evidence-Based Practices in Online Learning: A Meta-Analysis and Review of Online Learning Studies, which found that, on average, online instruction is as effective as classroom instruction.

And who can blame them? It is only natural for evangelists to seize upon evidence that furthers their cause.

But these results mystified me. If humans are gregarious beings and learning is social, how can face-to-face instruction possibly fail to outperform its online equivalent?

That was until I watched Professor Steve Fuller’s Humanity 2.0 TEDxWarwick talk in Week 3 of The University of Edinburgh’s E-learning and Digital Cultures course.

The professor explains with wonderful articulation how difficult it is to define a human.

Sure, biologists will define humanity in terms of DNA, yet they can’t even agree on whether the Neanderthals were a subspecies of Homo sapiens or a separate species altogether.

If we remove our gaze from the electron microscope, we have our morphology. Perhaps a human is an organism that has five fingers on each hand? But does that mean someone who is born with four (or six) is not human?

Perhaps a human is an organism that uses tools? Well, vultures drop rocks onto eggs to break them open.

Perhaps then a human is an organism that uses language? Whales might have something to say about that.

It is an intriguing conundrum that has occupied our thoughts since anyone can remember.

Title page of the first edition of René Descartes' Discourse on Method.

In the 17th Century, René Descartes made an intellectual breakthrough. He contended that “reason…is the only thing that makes us men, and distinguishes us from the beasts”. In other words, we are the only creatures on God’s earth capable of rational thought. I think, therefore I am.

Descartes pushed his point by arguing that while a robot might one day be developed to speak words, “it is not conceivable that such a machine should…give an appropriately meaningful answer in its presence”. And despite astonishing advances in artificial intelligence, the philosophical Frenchman remains right. Even Watson, who triumphed at Jeopardy! and today mines big data to help humans make better decisions, cannot reasonably be considered a human itself. It is simply a product of computer programming.

Speaking of machines, if a human were to progressively replace her body parts with robotics – hence becoming a cyborg – at what point does she cease to be a human? According to the humanist tradition of Descartes, the absolute difference between a human and a non-human is a property of the mind. So, arguably she will remain a “human” until her brain is replaced.

But that raises the question: if we flip the scenario around and place a person’s brain in a robot’s body, does that make it a human?

All this philosophy starts to do my head in after a while, and that’s before getting into Freud’s posthumanism.

Somehow I prefer Joseph Gliddon’s simpler definition of a human: something that drinks coffee.

It’s not as flippant as it sounds, for it is our artificial enhancements that paradoxically make us more human.

Riding a bicycle, for example, is a quintessentially human endeavour. No other creature does it. Yes, a monkey might do so in the circus, but the reason we find it funny (or at least unusual) is that it doesn’t normally do that. The poor thing is mimicking a human.

Similarly, digital technology is an extension of our notion of humanity. Humans are the only organisms that use computers, surf the Web, write text, film video, record audio, and engage with one another in online discussion forums.

So when we view online pedagogy through this lens, we recognise very little of it that is not human. Consequently the strong performance of online students becomes less mysterious. In fact, it becomes expected because, just as a bicycle enhances our capability for travel, digital technology enhances our capability for learning.

This expectation is supported by a further finding of the Department of Education’s research – namely, that “blends of online and face-to-face instruction, on average, had stronger learning outcomes than did face-to-face instruction alone”. In other words, students who had the technology via the blended design performed better than those who didn’t.

But it doesn’t work in reverse: “the majority of…studies that directly compared purely online and blended learning conditions found no significant differences in student learning”. In other words, those who had the face-to-face interaction via the blended design performed no better than those who didn’t. Apparently the online instruction was human enough.

OK, on that bombshell, I think I’ll ride my bike to the cafe and pick up a cup of joe…

Introducing the Social Intranet Index

9 July 2012

There’s a lot of talk about social intranets these days. It even threatens to overtake the blogosphere’s current obsession with gamification.

But what exactly is a social intranet…?

Everyone seems to have a different opinion, from a human-centred platform, to the intersection between portals, team sites and social sites, to a system that ties the business’s processes and data to the employee’s social behaviour.

Which one is correct? They all are.

You see, a “social intranet” is simply an intranet with social media elements that allow the users to interact with the content and with each other.

While everyone’s definition covers this functionality more or less, what is different is the degree of the functionality.

So, to introduce a common language and some standardisation to our discourse, I propose the “Social Intranet Index” (SII).

The Social Intranet Index is a metric that denotes the degree of social functionality afforded by an enterprise’s intranet. From 1 through to 10, the SII represents an increasing level of sociability…

1. An intranet with an SII of 1 is the traditional, old-fashioned broadcast medium. Its content is published by a select few (usually members of the Communications team) and remains read-only for the target audience.

2. An intranet with an SII of 2 accommodates special account holders outside of the golden circle. These are typically highly motivated individuals, because the backend is clunky and illogical.

Unfortunately these individuals tend to find themselves in the unenviable position of publishing content for other people, because said people are either too dumb or too lazy to learn how to do it themselves. Strangely, though, they all know how to use Facebook.

3. An intranet with an SII of 3 introduces a star rating or a “like” facility. The target audience can interact (albeit minimally) with the content by judging its quality and relevance.

4. An intranet with an SII of 4 introduces a commenting facility. Beyond a reductionist score, the target audience can now post free-form comments in response to the content.

5. An intranet with an SII of 5 bolts on third-party social applications such as Yammer, Compendium and Confluence. While these apps aren’t components of the enterprise’s intranet proper, they’re accessible from there and thus form part of the network. The target audience is empowered to generate their own content within these ringfenced zones.

6. An intranet with an SII of 6 integrates social media elements such as a discussion forum, blogs and wikis into a single sign-on solution. The user experience is seamless.

7. An intranet with an SII of 7 maintains a bank of user profiles that includes everyone in the organisation and is accessible by anyone in the organisation. The profiles are rich (including photos, contact details and subject matter expertise) and integrate with the other components of the intranet (eg the discussion forum) to facilitate social networking.

8. An intranet with an SII of 8 enables the users to personalise the interface. This typically involves the selection and arrangement of social widgets (eg a particular blog, a discussion sub-forum), a filterable activity stream, plus external functionality such as a customisable RSS feed.

9. An intranet with an SII of 9 empowers anyone in the organisation to publish and edit “regular” informational content beyond the aforementioned social media elements, though still within certain ringfenced zones. For example, a team site may host user-generated content pertinent to that team.

10. An intranet with an SII of 10 is the poster boy of heterarchy. All content is easily publishable and editable by everyone in the organisation. Devoid of ringfences, the platform effectively becomes a giant wiki. The corporate community pitches in to produce and maintain organic knowledge.

Outlandish and unworkable, or innovative and game changing? At the very least, I say an SII of 10 is aspirational.
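
If you wanted to record the index in a tool such as an intranet audit script, one way to sketch it in Python is as an ordered enumeration. The member names are my own shorthand for the ten levels above, and the three organisations and their scores are invented for illustration.

```python
from enum import IntEnum


class SII(IntEnum):
    """Levels 1 to 10 of the Social Intranet Index. Names are my own labels."""
    BROADCAST = 1
    SPECIAL_PUBLISHERS = 2
    RATINGS = 3
    COMMENTS = 4
    BOLT_ON_SOCIAL_APPS = 5
    INTEGRATED_SOCIAL_MEDIA = 6
    RICH_USER_PROFILES = 7
    PERSONALISATION = 8
    OPEN_PUBLISHING_IN_ZONES = 9
    FULL_WIKI = 10


# Hypothetical organisations and scores, invented for illustration.
intranets = {
    "Acme": SII.COMMENTS,
    "Globex": SII.RICH_USER_PROFILES,
    "Initech": SII.BROADCAST,
}

# Because the levels are ordered integers, intranets can be compared directly.
most_social = max(intranets, key=intranets.get)

for name, level in sorted(intranets.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: SII {int(level)} ({level.name})")
```

An ordered type suits the purpose because the whole point of the index is to let us say that one intranet is more social than another.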

Concurrent trends associated with the Social Intranet Index

From 1 to 10, the Social Intranet Index represents a series of concurrent trends.

Most radically, the direction of publishing shifts from one-way to two-way to multi-way. This is typically associated with an increasing ease of use, which in turn encourages an increasing number of content producers.

Knowledge contained in silos is increasingly shared, and a broader community blossoms. As governance loosens, the organisation puts more trust in its own employees. Effectively, its hierarchy flattens.

As more control is relinquished by the company to its people, however, the risk of something going wrong increases. The content that is generated by the users might be flawed, and in extreme cases an individual might abuse their privileges and do something malicious.

On the other side of the coin, though, loose governance does not mean no governance. Sensitive content may still be locked, while an approval process and a reversion facility can prevent disaster.

Moreover, it may be argued that the shifting paradigm places an increasing obligation on the SME not only to share their knowledge with the wider organisation, but also to maintain its currency and relevance. Those who can’t or won’t will soon get found out.

Clearly, a “social intranet” is not just about the technology; it’s about the culture of the organisation. Just because sophisticated functionality is available does not necessarily mean it will be used!

Notwithstanding this truism, I submit that culturally speaking, an SII of 1 is poles apart from an SII of 10. The former is characteristic of a restrictive, distrustful, clunky organisation, while the latter is characteristic of an open, empowering, nimble one.

Which organisation do you think will be more collaborative?

Which one is more adaptable to change?

Which one will ultimately perform better in the market?

Closer to home, what is the SII of your organisation’s intranet…?