Using Speech to Text Software to Transcribe Primary Sources?

This post is really me taking an off-the-wall idea and bouncing it off fellow Civil War buffs.  Over at The Siege of Petersburg Online, I’ve collected tons of newspaper articles in PDF or image form.  My challenge is to get those items transcribed for use on my site, either by transcribing them myself (happens sometimes), having volunteer transcribers do this for me (this is going on almost daily), or by some other method.

It’s the “some other method” I’ve been thinking about a LOT lately.  I’d like to take the plunge and purchase voice-to-text software, probably Dragon NaturallySpeaking 12 by Nuance, to literally read old newspaper articles out loud and have the program transcribe them for me.  I have some doubts as to whether this is feasible, but some of the reviews lead me to believe it is.  More than one person used their Dragon software to actually dictate the reviews, and many are fairly lengthy.

So I ask readers, have any of you used voice recognition software?  If so, could it be used in the manner I describe above?  Taking it further, have any of you tried to use this software for Civil War purposes?  I’d like to hear from anyone who has thoughts on whether or not this might work.  If it does, I could do multiple transcriptions a night in about a third of the time it takes to do it the old-fashioned way.  My kids go to bed around 8, so I don’t have much “me time” at night.  Anything that cuts the time needed to do this would help me tremendously.

Comments

11 responses to “Using Speech to Text Software to Transcribe Primary Sources?”

  1. Bummer

    Brett,
    Bummer’s wife has been pushing the “Dragon” software for a couple of months, so it looks like the “old guy” is going to give it a shot. Will let you know if it works out; maybe something can be worked out for your benefit. Keep in touch on any ideas.

    Bummer

  2. Geoff Rothwell

    I do not have firsthand experience, but some of the kids at the school where I teach use Dragon NaturallySpeaking. What I do know is that Dragon can take a lot of time to “train,” i.e., to get used to your voice. If you are willing to put in the hours (and it will take several hours from what I have seen) then I don’t see why it couldn’t work for your plan. Oh yeah, and once you’ve got it trained, don’t read to Dragon if you’ve got a cold. It will think you’re a different person!

  3. Frank, Thomas W

    I use Dragon software to dictate primary sources frequently. Especially deeds for genealogical work, journals and newspaper articles. It is particularly useful for dictating from documents you have to hold in your hands while reading. HOWEVER… you definitely have to proofread and you can expect to invest a bit of time getting the system to recognize your speech and you will have to learn the ins and outs of customizing it sot that the dictated speech appears the way you want it to appear (1st vs first, 1 thousand vs 1,000 vs one thousand … etc)
    Once you have it customized and have learned how to use the software…. you will love it. Expect some typos.. and always proofread. They typos will be hidden… “to” instead of “too” or “two” ….. “Rocks and a” .. instead of “Roxanna” etc…
    Good luck!
    Tom

  4. Frank, Thomas W

    Clearly I should have proofread the above message before hitting “submit”… but it can be even more perilous with voice recognition software. On balance… the pros outweigh the cons in my view.
    Tom

  5. Jmnlman

    It’s possible in theory to do what you describe. The biggest problem you’re going to find is that the text will have to be edited after you do it. I’m actually using Siri on the iPhone to do this comment right now. Proper names and names of locations probably won’t be in the database. You may find it’s faster to type it out by hand. Sometimes the editing can be slow to do. You won’t be able to dictate for several hours straight. The voice does tend to go, and then dictation quality goes way down.

  6. Joe

    Acrobat Reader X can save PDF files as text and, if online, in Word. I have the full version of Acrobat 9 and can save a PDF file in rich or plain text in addition to several other formats.
    Haven’t used Dragon NaturallySpeaking, but have heard that it does a decent job. You will need to spend some time training it to your voice and expect to do some editing. My Android phone can do speech to text for messaging fairly well right out of the box, with some editing.

  7. Fred Ray

    Brett,

    I tried it a few years ago with some of the Blackford letters — it worked ok but not great and I found it faster to do it the old-fashioned way. However, friends tell me that the software has improved a great deal since then.

  8. Charles Purvis

    I’m using Dragon 12 and it works great. As stated above, you will have to proofread afterward to clean up the document and make it accurate.
    Charlie

  9. Brett Schulte

    Thanks for all of the comments everyone! I had a family Christmas today so I haven’t had a chance to see all of the great feedback here. Absolutely I expect to spend hours and more if needed to train the software to my voice. I’m also willing to put in the effort to get the software to dictate to my preference (1st vs first, etc.). One point made above about proper place names is something I’ve wondered about. If I say “Rappahannock” enough times can I train the software to spit it out and spell it correctly? And how long will I have to train it to spit out Taliaferro if I say “Tolliver?” 😉

  10. Bummer

    Brett,

    The kids gave Bummer the “Dragon” package with headphone and accoutrements for Christmas. The “old guy” is not as tech savvy as you, so wish Bummer luck. Will keep you posted. Merry Christmas to you and yours.

    Bummer

  11. Michael Weeks

    Brett, I just sent you an email about this – my wife’s already “trained” her software, so we can try something out there. Forgot to mention, though, that she has the same version you mention – NaturallySpeaking 12.

    She’s writing resumes with it, with full formatting and little work afterward, and swears by it.
