Show HN: Open-source macOS AI copilot using vision and voice
424 by ralfelfving | 154 comments
Heeey! I built a macOS copilot that has been useful to me, so I open sourced it in case others would find it useful too.

It's pretty simple:
- Use a keyboard shortcut to take a screenshot of your active macOS window and start recording the microphone.
- Speak your question, then press the keyboard shortcut again to send your question + screenshot off to OpenAI Vision.
- The Vision response is presented in-context, overlaid on the active window, and spoken to you as audio.
- The app keeps running in the background, only taking a screenshot/listening when activated by the keyboard shortcut.

It's built with NodeJS/Electron, and uses the OpenAI Whisper, Vision and TTS APIs under the hood (BYO API key). There's a simple demo and a longer walk-through in the GH readme (https://ift.tt/JTdoLhy), and I also posted a different demo on Twitter: https://twitter.com/ralfelfving/status/1732044723630805212
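For readers who want a feel for the flow described above, here is a minimal TypeScript sketch of two pieces: the press-once-to-start, press-again-to-send toggle, and a request body that combines the question text with the screenshot. This is not the project's actual code. The names `CopilotState`, `nextState`, `Capture`, and `buildVisionRequest` are hypothetical, the `max_tokens` value is arbitrary, and the model name is left as a parameter because the post does not say which one is used. The message shape (a `text` part plus an `image_url` part carrying a data URL) follows OpenAI's Chat Completions format for image inputs; it is assumed here that the screenshot is already available as a PNG data URL and that the spoken question has already been transcribed.

```typescript
// Hypothetical sketch, not taken from the project's source.

type CopilotState = "idle" | "recording";

// First shortcut press starts a capture; second press ends it and
// signals that the question + screenshot should be sent.
function nextState(state: CopilotState): { state: CopilotState; send: boolean } {
  return state === "idle"
    ? { state: "recording", send: false }
    : { state: "idle", send: true };
}

interface Capture {
  screenshotDataUrl: string; // assumed form: "data:image/png;base64,...."
  question: string;          // transcript of the spoken question
}

type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface VisionRequest {
  model: string;
  max_tokens: number;
  messages: { role: "user"; content: ContentPart[] }[];
}

// Build a Chat Completions style body with one text part and one image part.
// `model` must be a vision-capable model name supplied by the caller.
function buildVisionRequest(capture: Capture, model: string): VisionRequest {
  return {
    model,
    max_tokens: 300, // arbitrary cap chosen for this sketch
    messages: [
      {
        role: "user",
        content: [
          { type: "text", text: capture.question },
          { type: "image_url", image_url: { url: capture.screenshotDataUrl } },
        ],
      },
    ],
  };
}
```

In an Electron app, the shortcut handler would call something like `nextState` on each press and, when `send` is true, POST the result of `buildVisionRequest` to the API using your own key. Audio capture, transcription, the overlay window, and speech playback are left out of this sketch.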