Posted tagged ‘definition’

The sum of us

10 July 2017

What is the definition of the term “data scientist”…?

In my previous post, Painting by numbers, I offered a shorthand definition of data science based on what I could synthesise from the interwebs. Namely, it is the combination of statistics, computer programming, and domain expertise to generate insight. It follows, then, that the definition of data scientist is someone who has those skill sets.

Fat chance!

In this post I intended to articulate my observation that in the real world, incredibly few people could be considered masters of all three disciplines. I was then going to suggest that rather than seeking out these unicorns, employers should build data science teams comprising experts with complementary talents. I say “was” because I subsequently read this CIO article by Thor Olavsrud in which he quotes Bob Rogers saying, well… that.

Given Thor and Bob have stolen my thunder (18 months ago!) I think the only value I can add now is to draw a parallel with pop culture. So I will do so with the geeky HBO sitcom Silicon Valley.

The cast of Silicon Valley: Dinesh, Gilfoyle, Richard, Jared and Erlich.

If you aren’t familiar with this series, the plot revolves around the trials and tribulations of a start-up called Pied Piper. Richard is the awkward brainiac behind a revolutionary data compression algorithm, and he employs a sardonic network engineer, Gilfoyle, and another nerdy coder, Dinesh, to help bring it to market. The other team members are the ostentatious Erlich – in whose incubator (house) the group can work rent-free in exchange for a 10% stake – and Jared, a mild-mannered economics graduate who could have been plucked from the set of Leave It to Beaver.

The three code monkeys are gifted computer scientists, but they have zero business acumen. They are entirely dependent on Jared to write up their budgets and forecasts and all the other tickets required to play in the big end of town. Gilfoyle and Dinesh’s one attempt at a SWOT analysis is self-serving and, to be generous, NSFW.

Conversely, Jared would struggle to spell HTML.

Arguably the court jester, Erlich, is the smartest guy in the room. Despite his OTT bravado and general buffoonery, he proves his programming ability when he rolls up his sleeves and smashes out code to rescue the start-up from imploding, and he repeatedly uses his savvy to shepherd the fledgling business through the corporate jungle.

Despite the problems and challenges the start-up encounters throughout the series, it succeeds not because it is a team of unicorns, but because it comprises specialists and a generalist who work together as a team.

Unicorn silhouette

And so the art of Silicon Valley shows us how unlikely we would be in real life to recruit an expert statistician / computer programmer / business strategist. Each is a career in its own right that demands years of education and practice to develop. A jack-of-all-trades will inevitably be a master of none.

That is not to say a statistician can’t code, or a programmer will be clueless about the business. My point is, a statistician will excel at statistics, a computer programmer will excel at coding, while a business strategist will excel at business strategy. And I’m not suggesting the jack-of-all-trades is useless; on the contrary, he or she will be the glue that holds the specialists together.

So that raises the question… which one is the data scientist?

Since each is using data to inform business decisions, I say they all are.

Painting by numbers

3 June 2017

A lifetime ago I graduated as an environmental biologist.

I was one of those kids who did well in school, but had no idea what his vocation was. As a pimply teenager with minimal life experience, how was I to know even half the jobs that existed?

After much dilly-dallying, I eventually drew upon my nerdy interest in science and my idealistic zeal for conservation and applied for a BSc. And while I eventually left the science industry, I consider myself extremely fortunate to have studied the discipline because it has been the backbone of my career.

Science taught me to think about the world in a logical, systematic manner. It’s a way of thinking that is founded on statistics, and I maintain it should inform the activities we undertake in other sectors of society such as Learning & Development.

The lectures I attended and the exams I crammed for faded into a distant memory, until the emergence of learning analytics rekindled the fire.

In quick succession it has dawned on me that I love maths and stats, that I’ve drifted away from them over time, that the world is finally waking up to the importance of the scientific method, and that it is high time I refocused my attention on them.

So it is in this context that I have started to review the principles of statistics and its contemporary manifestation, analytics. My exploration has been accompanied by several niggling queries: what’s the difference between statistics and analytics? Is the latter just a fancy name for the former? If not, how not?

Overlaying the post-modern notion of data science, what are the differences among the three? Is a data scientist, as Sean Owen jokingly attests, a statistician who lives in San Francisco?

The DIKW Pyramid

My journey of re-discovery started with the DIKW Pyramid. This beguilingly simple triangle models successive orders of epistemology, which is quite a complex concept. Here’s my take on it…

The DIKW Pyramid, with Data at the base, Information a step higher, Knowledge another step higher, and Wisdom at the peak.

At the base of the pyramid, Data is a set of values of qualitative or quantitative variables. In other words, it is the collection of facts or numbers at your disposal that somehow represent your subject of study. For example, your data may be the weights of 10,000 people. While this data may be important, if you were to flick through the reams of numbers you wouldn’t glean much from them.

The next step up in the pyramid is Information. This refers to data that has been processed to make it intelligible. For example, if you were to calculate the average of those ten thousand weights, you’d have a comprehensible number that is inherently meaningful. Now you can do something useful with it.

The next step up in the pyramid is Knowledge. To avoid getting lost in a philosophical labyrinth, I’ll just say that knowledge represents understanding. For example, if you were to compare the average weight against a medical standard, you might determine these people are overweight.
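As a rough sketch of those first three steps, here is how the climb from data to information to knowledge might look in Python. The weights are made up for illustration, and the reference value is a hypothetical threshold of my own rather than a real medical standard.

```python
from statistics import mean

# Data: raw weights in kilograms (made-up values for illustration).
weights_kg = [62.0, 71.5, 88.0, 95.5, 79.0, 104.0, 68.5, 91.5]

# Information: process the data into a single comprehensible number.
average_weight = mean(weights_kg)

# Knowledge: compare the information against a standard.
# REFERENCE_WEIGHT_KG is a hypothetical threshold, not a real medical standard.
REFERENCE_WEIGHT_KG = 75.0
is_above_reference = average_weight > REFERENCE_WEIGHT_KG

print(f"Average weight: {average_weight:.1f} kg")
print(f"Above reference: {is_above_reference}")
```

The list on its own tells you little; the average is something you can read at a glance; the comparison is where understanding begins.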

The highest step in the pyramid is Wisdom. I’ll offer an example of wisdom later in my deliberation, but suffice it to say here that wisdom represents higher order thinking that synthesises various knowledge to generate insight. For example, the wise man or woman will not only know these people are overweight, but also recognise they are at risk of disease.

Some folks describe wisdom as future focused, and I like that because I see it being used to inform decisions.

Statistics

My shorthand definition of statistics is the analysis of numerical data.

In practice, this is done to describe a population or to compare populations – that is to say, infer significant differences between them.

For example, by calculating the average weight of 10,000 people in Town A, we describe the population of that town. And if we were to compare the weights of those 10,000 people with the weights of 10,000 people in Town B, we might infer the people in Town A weigh significantly more than the people in Town B do.
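To make the comparison concrete, here is a minimal sketch that assumes a Welch's t-test is a reasonable way to compare the two towns (the post doesn't name a specific test). The function name `welch_t` and the tiny samples are my own inventions for illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic for two independent samples."""
    var_a = variance(sample_a)  # sample variance (n - 1 denominator)
    var_b = variance(sample_b)
    # Standard error of the difference between the two sample means.
    standard_error = sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error


# Made-up weights in kilograms; a real study would use far larger samples.
town_a = [82.0, 91.0, 77.5, 88.0, 95.0, 84.5]
town_b = [70.0, 74.5, 68.0, 79.0, 72.5, 76.0]

t_stat = welch_t(town_a, town_b)
print(f"Welch's t = {t_stat:.2f}")
```

A large positive value suggests Town A weighs more on average, but judging significance also requires a p-value from the t distribution. Python's standard library doesn't compute that; SciPy's `scipy.stats.ttest_ind` with `equal_var=False` is one option.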

Similarly, if we were to compare the household incomes of the 10,000 people in Town A with the household incomes of the 10,000 people in Town B, we might infer the people in Town A earn significantly less than the people in Town B do.

Then if we were to correlate all the weights against their respective household incomes, we might demonstrate that they are negatively correlated: as household income rises, weight tends to fall.
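A correlation like this can be sketched with Pearson's coefficient, assuming a roughly linear relationship between the two variables. The helper `pearson_r` and the paired values below are hypothetical, written out by hand so the arithmetic is visible.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Return Pearson's correlation coefficient for paired observations."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    mean_x, mean_y = mean(xs), mean(ys)
    # Sum of the products of each pair's deviations from the means.
    cross_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cross_products / (spread_x * spread_y)


# Made-up pairs: household income (thousands of dollars) and weight (kg).
incomes = [35, 48, 52, 67, 80, 95]
weights = [98.0, 92.5, 90.0, 84.0, 79.5, 73.0]

r = pearson_r(incomes, weights)
print(f"r = {r:.2f}")
```

A coefficient near -1 indicates that weight falls as income rises, a coefficient near +1 indicates they rise together, and a coefficient near 0 indicates no linear relationship.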

The DIKW Pyramid, showing statistics converting data into information.

Thus, our statistical tests have used mathematics to convert our data into information. We have climbed a step up the DIKW Pyramid.

Analytics

My shorthand definition of analytics is the analysis of data to identify meaningful patterns.

So while analytics is often conflated with statistics, it is indeed a broader expression – not only in terms of the nature of the data that may be analysed, but also in terms of what is done with the results.

For example, if we were to analyse the results of our weight-related statistical tests, we might recognise an obesity problem in poor neighbourhoods.

The DIKW Pyramid, showing analytics converting data into knowledge.

Thus, our application of analytics has used statistics to convert our data into information, which we have then translated into knowledge. We have climbed another step higher in the DIKW Pyramid.

Data science

My shorthand definition of data science is the combination of statistics, computer programming, and domain expertise to generate insight. Or so I’m led to believe.

Given the powerful statistical software packages currently available, I don’t see why anyone would need to resort to hand coding in R or Python. At this early stage of my re-discovery, I can only assume the software isn’t sophisticated enough to compute the specific processes that people need.

Nonetheless, if we return to our obesity problem, we can combine our new-found knowledge with existing knowledge to inform strategic decisions. For example, given we know a healthy diet and regular exercise promote weight loss, we might seek to improve the health of our fellow citizens in poor neighbourhoods (and thereby lessen the burden on public healthcare) by building sports facilities there, or by subsidising salad lunches and fruit in school canteens.

The DIKW Pyramid, showing data science converting data into wisdom.

Thus, not only has our application of data science used statistics and analytics to convert data into information and then into knowledge, it has also converted that knowledge into actionable intelligence.

In other words, data science has converted our data into wisdom. We have reached the top of the DIKW Pyramid.

The definition of Enterprise Social Network

26 August 2015

Enterprise Social Network, n. 1. A software platform that facilitates communication and collaboration among the employees of a company. 2. A means of liking senior executives' posts.

The paradox of augmented reality

27 May 2013

Sydneysider Scott O’Brien is back in town after an extended stint in San Francisco. Scott is the Co-founder & CMO of Explore Engage, a digital media company that is attracting serious attention for its augmented reality eyewear.

I caught up with Scott in the harbour city and asked him the following questions…

  • What are your favourite examples of augmented reality? (0:08)
  • What are you working on at Explore Engage? (2:20)
  • How does your eyewear differ from Google Glass? (3:05)
  • Is augmented reality worth the hype? (3:54)
  • What opportunities exist for the finance sector? (5:04)
  • What opportunities exist for workplace education? (6:14)

I was impressed with the examples of augmented reality cited by Scott.

Medical education has long been the poster boy of salivatingly engaging content, and the tradition continues with this emerging technology. Daqri’s 4D Anatomy app showcases the visualisation capabilities of the medium, while the Australian Defence Force not only targets a real-world need with their Mobile Medic app, but also incorporates it into their recruitment process.

Ingress Enlightened logo

Google’s Ingress is an augmented reality MMOG that exemplifies the gamification capability of the medium. Two factions fight for control over the real world by capturing virtual “portals” that are represented by public landmarks such as statues and fountains.

The fact that Ingress was developed by Google’s internal startup, Niantic Labs, is enlightening (excuse the pun). Augmented reality is still an emerging technology in which experiments must be undertaken and failures borne. It is by learning from the results, and responding to them via adaptation, that you increase the probability of breakthrough success.

I am also fascinated by Google’s marketing strategy with Ingress. The game is in “closed beta” mode, which means you need an invitation to play it. Reminiscent of Studio 54, only the members of the “in” crowd have the privilege of enjoying that which is denied to others. Google deepens the mystique by seemingly neglecting to promote the product – instead relying on organic growth of the subculture.

On the subject of Google, I think Scott’s differentiation between Google Glass and Explore Engage’s Augmented Reality Eyewear is an important one. While Google Glass has augmented reality capability, it is essentially a wearable computer with which digital information is conveniently presented in front of the wearer’s eye. In contrast, the Explore Engage eyewear is specifically designed to integrate digital information with the real-world background. There is no better example of the latter concept than BMW’s Augmented Reality Glasses – which aren’t Explore Engage’s by the way, but are oh so sexy all the same.

While I’m on my definitions soapbox, I’ll take this opportunity to point the finger at Star Chart. This is a wonderful (and free) app, but its so-called “augmented reality mode” is no such thing; it does not lay its stellar information over the night sky! In contrast, Sun Seeker lays the sun’s trajectory over the real background. In other words, it augments reality.

Money

In terms of ROI, the 2.5 million downloads of Transformers 3’s Defend the Earth speak for themselves. The return on Audi’s Virtual Q3 is less obvious, but that’s because it’s less about car sales and more about engaging consumers and associating the brand with innovation. How do you evaluate that? By analysing car sales of course, after the Q3 finally lands on Aussie shores.

While the Commonwealth Bank should be applauded for their Property Guide app, which combines geolocation with big data to provide something truly useful to their prospective customers, I must say as someone in the financial services industry: the general lack of financially oriented augmented reality apps represents a typical lack of imagination in the sector. Worse still, the examples highlighted by Infosys’s whitepaper are almost exclusively home finders and ATM locators, which means they’ve merely copied each other. Yawn.

As I am concurrently in the education profession, however, I must also recognise that the potential for augmented reality remains largely untapped. Scott’s examples attest to the power of the medium in terms of visualisation, gamification and performance support – which are factors that make education in the workplace engaging and effective. So what are we waiting for?

I think the mobility of the technology also remains underexploited. For example, how about an architecture tour of your local city in which details of buildings are highlighted when you point your mobile device at them? Or even better, when you look at them through your AR-enabled glasses?

And Scott’s mention of avatars adds more fuel to the fire of possibility. I imagine learning interventions in dangerous environments (such as mining sites) in which training can be undertaken in context, minus the threat to life or limb. Unlike in a simulator or a virtual world, the training is done at the workplace.

Therein lies the paradox of augmented reality. By complementing the real world with artificiality, it makes the learning experience more authentic.

Human enough

19 February 2013

It is with glee that the proponents of e-learning trumpet the results of studies such as the US Department of Education’s Evidence-Based Practices in Online Learning: A Meta-Analysis and Review of Online Learning Studies, which found that, on average, online instruction is as effective as classroom instruction.

And who can blame them? It is only natural for evangelists to seize upon evidence that furthers their cause.

But these results mystified me. If humans are gregarious beings and learning is social, how can face-to-face instruction possibly fail to outperform its online equivalent?

That was until I watched Professor Steve Fuller’s Humanity 2.0 TEDxWarwick talk in Week 3 of The University of Edinburgh’s E-learning and Digital Cultures course.

The professor explains with wonderful articulation how difficult it is to define a human.

Sure, biologists will define humanity in terms of DNA, yet they can’t even agree on whether the Neanderthals were a subspecies of Homo sapiens or a separate species altogether.

If we remove our gaze from the electron microscope, we have our morphology. Perhaps a human is an organism that has five fingers on each hand? But does that mean someone who is born with four (or six) is not human?

Perhaps a human is an organism that uses tools? Well, vultures drop rocks onto eggs to break them open.

Perhaps then a human is an organism that uses language? Whales might have something to say about that.

It is an intriguing conundrum that has occupied our thoughts since anyone can remember.

Title page of the first edition of René Descartes' Discourse on Method.

In the 17th Century, René Descartes made an intellectual breakthrough. He contended that “reason…is the only thing that makes us men, and distinguishes us from the beasts”. In other words, we are the only creatures on God’s earth capable of rational thought. I think, therefore I am.

Descartes pushed his point by arguing that while a robot might one day be developed to speak words, “it is not conceivable that such a machine should…give an appropriately meaningful answer in its presence”. And despite astonishing advances in artificial intelligence, the philosophical Frenchman remains right. Even Watson, who triumphed at Jeopardy! and today mines big data to help humans make better decisions, cannot reasonably be considered a human itself. It is simply a product of computer programming.

Speaking of machines, if a human were to progressively replace her body parts with robotics – hence becoming a cyborg – at what point does she cease to be a human? According to the humanist tradition of Descartes, the absolute difference between a human and a non-human is a property of the mind. So, arguably she will remain a “human” until her brain is replaced.

But that raises the question: if we flip the scenario around and place a person’s brain in a robot’s body, does that make it a human?

All this philosophy starts to do my head in after a while, and that’s before getting into Freud’s posthumanism.

Somehow I prefer Joseph Gliddon’s simpler definition of a human: something that drinks coffee.

Cup of coffee

It’s not as flippant as it sounds, for it is our artificial enhancements that paradoxically make us more human.

Riding a bicycle, for example, is a quintessentially human endeavour. No other creature does it. Yes, a monkey might do so in the circus, but the reason we find it funny (or at least unusual) is that it doesn’t normally do that. The poor thing is mimicking a human.

Similarly, digital technology is an extension of our notion of humanity. Humans are the only organisms that use computers, surf the Web, write text, film video, record audio, and engage with one another in online discussion forums.

So when we view online pedagogy through this lens, we recognise very little of it that is not human. Consequently the strong performance of online students becomes less mysterious. In fact, it becomes expected because, just as a bicycle enhances our capability for travel, digital technology enhances our capability for learning.

This expectation is supported by a further finding of the Department of Education’s research – namely, that “blends of online and face-to-face instruction, on average, had stronger learning outcomes than did face-to-face instruction alone”. In other words, students who had the technology via the blended design performed better than those who didn’t.

But it doesn’t work in reverse: “the majority of…studies that directly compared purely online and blended learning conditions found no significant differences in student learning”. In other words, those who had the face-to-face interaction via the blended design performed no better than those who didn’t. Apparently the online instruction was human enough.

OK, on that bombshell, I think I’ll ride my bike to the cafe and pick up a cup of joe…