Wanna Chat With Your DVR?

Dave Zatz —  May 1, 2013 — 11 Comments


According to Variety, DirecTV has been working on a Nuance-powered iPhone app update to bring speech recognition to HR24 and newer set-top boxes. My initial reaction was that it’s nothing more than a clever, but not very practical, application of Siri-like skills. Yet, upon reflection, being able to change channel via station name, rather than researching a corresponding number I probably don’t know, seems quite compelling. Natural language interactivity might even come in handy when attempting to determine when a given show airs. However, I don’t imagine voice control would be the most precise or efficient way to schedule and manage DVR recordings, and I’m not particularly interested in finding “a Tom Cruise movie this weekend.”


11 responses to Wanna Chat With Your DVR?

  1. I don’t even use voice control on my phone so I can’t imagine using it to control a DVR.

  2. I tend to agree with Brennok. I find second screen apps for controlling my DVR(s) more appealing than voice command.

  3. Well, it all depends on how well it works, doesn’t it?

    If you could flawlessly say “record every episode of Battlestar Galactica after that one where they have to keep jumping every 15 minutes or something” then hey, that would be great.

    As you suggest, though, a more realistic hope would be for speaker-independent recognition of the channel name rather than number. So I could say “Tune to A&E” rather than hunting through the channel guide trying to remember where it is. I’ve disabled almost all of the SD channels and a lot of the HD ones too, and yet, since I watch mostly things I’ve already recorded, I have little knowledge of what the channel numbers are for anything other than the big four.

    Honestly though, I’d really rather say “tune to New Girl” and have that work. But it probably won’t. At least not reliably enough to be something you use all the time.

    Why won’t somebody just let me change the channel guide so it shows me what’s on sorted by how likely I am to want to watch it, rather than some arbitrary channel numbering? That would be a lot more useful than this.

  4. “If you could flawlessly say “record every episode of Battlestar Galactica after that one where they have to keep jumping every 15 minutes or something” then hey, that would be great.”

    Your DVR responds:

    “Would you like me to search the web for Battlestar Galactica?”

  5. This has been some weird version of the future for 30 years. “Well, just talk to our computers and stuff will magically happen…”

    It reminds me of things like 3D movies being some Holy Grail of the film industry.

    The reality: People don’t want to talk to their computers or phones. Sure, as you’ve seen with Siri, when something new is introduced people will do it for the novelty but that novelty soon wears off and they just want to push a button and make something happen.

    The other things that get in the way are:
    - We don’t want a lot of chatter around us with people speaking commands into a mic for everything that they do.
    - It’s always a little weird to talk to a computer, though I’ll admit that some of the CSR computer recognition stuff is pretty amazing – still, I wouldn’t be there if I could accomplish what I wanted on their website.
    - Context is tough. Really tough.
    - What people say and what people mean are often different things. Add to that that when people say one thing, they often forget about the many options because they already have it in their mind the thing that they want. For instance, if you go to the grocery store for orange juice you think, “I’m going to the grocery store for orange juice,” and if you were ordering via voice you’d say, “I’d like some orange juice delivered, please.” Ok, then you have to traverse: brand, pulp amount, container size, vitamin C added, vitamin D added, any sales that would affect your purchase, etc. When you’re at the store you do this in a matter of moments. Being hit with one question after another will eventually lead you to, “I just want some freaking OJ – I’ll do it myself.”

    I suspect that it’ll be like Chucky wrote, though: if it doesn’t instantly know what you’re talking about, it’ll offer to do something useless for you.

  6. “This has been some weird version of the future for 30 years. “Well, just talk to our computers and stuff will magically happen…””

    It’s all Kubrick’s fault. We’ve been trained to want our conversing computers to be highly skillful psychopathic killers, and now we’re disappointed to find that they’re just harmless incompetents instead.

    However, if someone would come up with a DVR that would kill everyone in the house with lasers if an occupant voice-requested the recording of lots of reality shows, that’d sell like hot cakes…

  7. Voice controls? This is where the R&D money is going?

    The improvements that more people want in DVRs are: more storage space; better recommendation “engine”; and more tuners.

  8. “What people say and what people mean are often different things. Add to that that when people say one thing, they often forget about the many options because they already have it in their mind the thing that they want. For instance, if you go to the grocery store for orange juice you think, “I’m going to the grocery store for orange juice,” and if you were ordering via voice you’d say, “I’d like some orange juice delivered, please.” Ok, then you have to traverse: brand, pulp amount, container size, vitamin C added, vitamin D added, any sales that would affect your purchase, etc. When you’re at the store you do this in a matter of moments. Being hit with one question after another will eventually lead you to, “I just want some freaking OJ – I’ll do it myself.””

    Speech and voices, both internal monologue and interpersonal dialog, are deeply weird and spooky. Very late bolt-on to the toolkit in human evolution.

    I’ve always thought Julian Jaynes’ The Origin of Consciousness in the Breakdown of the Bicameral Mind had a lot of truth to it, even if you can argue the details.

    And what that all means is that, until computers can reliably pass a vocal Turing Test, point-and-click UIs, or current ten-foot UIs, or some other precision control UI will always be less freaking trouble in getting the OJ.

    Until we pass that Turing Test point, voices in the uncanny valley will always be more befuddling, unsettling, and freaking trouble than useful.

    —–

    “Yet, upon reflection, being able to change channel via station name, rather than researching a corresponding number I probably don’t know, seems quite compelling.”

    If someone had the good sense to just implement that, implement it skillfully, and not significantly raise costs in the process, it’d be a win. But somehow, that’s not how I see the marketing/accounting/development process playing out…

  9. cypherstream May 3, 2013 at 7:39 pm

    I’ll use it for search. Really, typing in search criteria is a real pain. You can triple-tap the number keys on the remote, but that still takes forever when searching in YouTube or SmartSearch.

    I wish they had a slide-out QWERTY remote… or heck, use the iPhone/iPad/Android app to send QWERTY keystrokes to the receivers.

    Even better, give it YouTube TV pairing support. That works excellently on my Xbox. Search YouTube on my iPhone, hit the little icon, and it sends it to the Xbox.

  10. I wasn’t very impressed.
