Question
Will Google+ Hangout significantly change the way we communicate?
Answer
I wrote that post in 2008 as you note. Since then, I've gotten a bit skeptical of the ambient presence idea since I haven't actually seen it catch on anywhere. Webcams never got that popular outside the porn/voyeurism industry. I saw plenty of demos of things like speaker-following cameras a few years ago, but I don't see many around.
Cisco's wall-sized telepresence setups have kinda remained rarely-used stunts, mostly for when you want to put on a show. And I haven't heard of the ambient presence connection for remote couples catching on either. I've never used it despite being in that remote situation myself for several years.
So despite the ability to do ambient presence cheaply, we still do "appointment presence" or "on-demand presence" so to speak, and tend to default to the lowest bandwidth medium that will do the job. Is it just adoption delay, or something more fundamental? I think it's something more fundamental. I don't use the phone if I can email. I don't call if I can text. In web conferences I turn my camera off unless someone specifically asks to see my face and can pull rank to make me do it. I use the chat in Skype more often than I use audio.
The NYT had a piece about how people now think using the phone is rude if you don't need to. I totally agree:
http://www.nytimes.com/2011/03/2...
In fact the only kind of ambient presence that has caught on in even a minor way is Facebook wall updates/Twitter.
So I think we're missing something about the psychology of collaboration. Until I understand that, I won't speculate on the future of Google's shiny new toys, but I will propose the following new conjectures:
- For any communication need, be it foreground, ambient or both, people will use the lowest bandwidth, minimum modality, most asynchronous channel (let's call this channel the minimal channel) that will do the job. In most cases, this is asynchronous text.
- For any communication need, the minimal channel is determined by the emotional intensity of the communication need. You escalate from text to audio to video to rich immersive 3D video to whatever future olfactory/tactile things might be available based on how much you need emotion data bits.
- For ordinary, non-intimate interactions, text plus visual symbols (emoticons) have more than enough bandwidth to accommodate the emotional range.
So in a sense I have gone 180 degrees in my views. Initially I used to think richer modalities were about more sensory experience and that people desire sensory experience for its own sake. But the more I thought about it, the less sense this made.
If we were actually optimizing for sensory experience in a brain-in-a-vat, full Matrix illusion sense, today's tech is a bad joke. The difference between text and the best of today's immersive, holographic 3D, on a scale where being there in real life is 100%, is like 0.1% vs. 0.2%. (I don't mean this as a rhetorical percentage; I suspect if you did the analysis of bit rates of full sensory brain-in-a-vat experience, they WILL be orders of magnitude higher. I vaguely recall some estimate like this in Dennett's Consciousness Explained.)
Another way to think of this is as follows: I find digital versions of famous paintings or 3D walkthroughs of famous buildings quite satisfying. But I find even the best digital recreations of nature (say, even a simple walk in the woods) completely unsatisfying compared to the real thing. This means that human-created environments have very, very low information content compared to full-blown nature.
So why are we even bothering with this 0.1% to 0.2% upgrade? The only meaningful extra stuff it buys us is emotional communication, which is actually fairly low bandwidth. If I recall correctly, reading body language or facial expressions involves a fairly small bit rate: much higher than verbal, something like 5-6 times higher, but still nothing compared to full-blown sensory, brain-in-a-vat bit rates.
And when do BOTH parties ever want this extra emotional communication bit rate? Certainly not in most business situations, where we all want the low bandwidth so we can backchannel, be unshaven, work in pajamas, etc. There is too much of an adversarial element, and there are too many misaligned motives. The sales guy may want to see his client because he can manipulate better. The client may want to avoid being seen for exactly the same reason. Why do poker players wear dark glasses?
Probably only very intimate communication acts with near zero conflict and a very noisy text channel qualify. This means even most couples don't qualify.
So ambient presence with a baby or pet is likely a great feeling. Or a doctor with a patient. Or a military unit coordinating a real-time attack with shared situation awareness and verbal silence. Or some future augmented reality MMORPG where you play immersively in a faux-real environment instead of with avatars in a game environment.
These are outlier scenarios.
So my prediction: the everyday stuff will stay textual. Google Hangout will be little used in ambient ways.