Microsoft open-sourced a Python tool for converting files and office documents to Markdown

☆ Yσɠƚԋσʂ ☆@lemmy.ml · 3 days ago

utopiah@lemmy.ml · 1 day ago

Thanks for the clarification. I checked the code you linked and noticed recognize_google. It seems to rely on https://github.com/Uberi/speech_recognition which in turn relies on https://github.com/Uberi/speech_recognition/blob/master/speech_recognition/recognizers/google.py so basically, are they using an API and sending all the audio data to Google's servers?

django@discuss.tchncs.de · 1 day ago

Yes, this is how I read it as well. The library supports using a local model, but they decided to just send the audio data to Google.
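
To illustrate the difference, here is a rough sketch of how the speech_recognition library lets a caller pick between a local recognizer and the Google one. This is not markitdown's actual code. The helper names choose_method and transcribe are made up for the example, and the local path assumes the library's optional Whisper dependency is installed.

```python
# Sketch only: assumes the SpeechRecognition package is installed
# (pip install SpeechRecognition), plus its optional Whisper extra
# for the local path. Helper names below are hypothetical.

# Recognizer methods that run on the local machine.
LOCAL_METHODS = ("recognize_whisper", "recognize_sphinx")
# Recognizer methods that upload the audio to a remote service.
REMOTE_METHODS = ("recognize_google",)


def choose_method(allow_network: bool) -> str:
    """Return the name of the Recognizer method to call."""
    return REMOTE_METHODS[0] if allow_network else LOCAL_METHODS[0]


def transcribe(wav_path: str, allow_network: bool = False) -> str:
    """Transcribe a WAV file, staying offline unless explicitly allowed."""
    import speech_recognition as sr  # third-party; imported lazily

    recognizer = sr.Recognizer()
    with sr.AudioFile(wav_path) as source:
        audio = recognizer.record(source)  # read the whole file into memory
    method = getattr(recognizer, choose_method(allow_network))
    return method(audio)
```

With a default of allow_network=False, audio only leaves the machine when the caller opts in, which is the opposite of what the linked code appears to do.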

utopiah@lemmy.ml · 1 day ago

That might open up a GDPR-related issue. I don't think people using such a library assume that they need connectivity, or that their data would be sent to a third party.

GitHub - microsoft/markitdown: Python tool for converting files and office documents to Markdown.