How to Build a DIY AI Assistant with a Raspberry Pi


Want a small voice assistant that follows your lead, runs on your own hardware, and won't accidentally order twelve pineapples because it misheard you? A DIY AI assistant with a Raspberry Pi is surprisingly achievable, fun, and flexible. You'll wire together a wake word, speech recognition (ASR = automatic speech recognition), a natural-language brain (rules or an LLM), and text-to-speech (TTS). Add a few scripts, one or two services, and some careful audio tuning, and you have a smart speaker that obeys your rules.

Let's get you from zero to talking with your Pi without the usual hair-pulling. We'll cover parts, setup, code, comparisons, gotchas... the whole burrito. 🌯



What Makes a Good DIY AI Assistant with a Raspberry Pi ✅

  • Private by default: keep audio local where possible. You decide what leaves the device.

  • Modular: swap components like Lego: wake word engine, ASR, LLM, TTS.

  • Affordable: mostly open source, commodity microphones, speakers, and a Pi.

  • Hackable: want home automation, dashboards, routines, custom skills? Easy.

  • Reliable: managed as a service, it boots and starts listening automatically.

  • Fun: you'll learn a lot about audio, processes, and event-driven design.

Small tip: If you use a Raspberry Pi 5 and plan to run heavier local models, a clip-on cooler helps under sustained load. (When in doubt, pick the official Active Cooler designed for the Pi 5.) [1]


Parts and Tools You'll Need 🧰

  • Raspberry Pi : a Pi 4 or Pi 5 is recommended for headroom.

  • microSD card : 32 GB+ recommended.

  • USB microphone : a simple USB conference mic works well.

  • Speaker : a USB or 3.5 mm speaker, or an I2S amp HAT.

  • Network : Ethernet or Wi-Fi.

  • Optional niceties: a case, an active cooler for the Pi 5, a push button for push-to-talk, an LED ring. [1]

OS and Baseline Setup

  1. Flash Raspberry Pi OS with Raspberry Pi Imager. It is the straightforward way to get a bootable microSD with the settings you want. [1]

  2. Boot, connect to the network, then update packages:

sudo apt update && sudo apt upgrade -y

  3. Audio basics : On Raspberry Pi OS you can set the default output, levels, and devices through the desktop UI or raspi-config . USB and HDMI audio are supported on all models; Bluetooth output is available on models with Bluetooth. [1]

  4. Verify devices:

arecord -l
aplay -l

Then test capture and playback. If levels look odd, check the mixers and defaults before blaming the microphone.
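If you would rather check for the microphone from a script, the listing can be parsed. The sketch below is an illustration, not part of any official tooling: it assumes the usual `arecord -l` line layout ("card N: Name [Description], device M: ..."), and the function names are made up for this example.

```python
import re
import subprocess

# Matches lines such as:
#   card 1: Device [USB Audio Device], device 0: USB Audio [USB Audio]
CARD_LINE = re.compile(r"^card (\d+): (\S+) \[(.*?)\], device (\d+):")


def parse_capture_devices(listing: str):
    """Return (card, device, description) tuples found in `arecord -l` text."""
    devices = []
    for line in listing.splitlines():
        match = CARD_LINE.match(line.strip())
        if match:
            devices.append((int(match.group(1)), int(match.group(4)), match.group(3)))
    return devices


def list_capture_devices():
    """Run `arecord -l` (needs alsa-utils) and parse what it prints."""
    result = subprocess.run(["arecord", "-l"], capture_output=True, text=True, check=True)
    return parse_capture_devices(result.stdout)


SAMPLE_LISTING = """**** List of CAPTURE Hardware Devices ****
card 1: Device [USB Audio Device], device 0: USB Audio [USB Audio]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
"""

print(parse_capture_devices(SAMPLE_LISTING))  # [(1, 0, 'USB Audio Device')]
```

An empty list from `list_capture_devices()` is a quick signal that the mic is not visible to ALSA at all, which points at cabling or drivers rather than your code.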

 


Architecture at a Glance 🗺️

A sensible DIY AI assistant with a Raspberry Pi flow looks like this:

Wake word → live audio capture → ASR transcription → intent handling or LLM → response text → TTS → audio playback → optional actions via MQTT or HTTP.

  • Wake word : Porcupine is small, accurate, and runs locally, with sensitivity control per keyword. [2]

  • ASR : Whisper is a multilingual, general-purpose ASR model trained on ~680k hours of audio; it is robust to accents and background noise. For on-device use, whisper.cpp provides a lean C/C++ inference path. [3][4]

  • Brain : Your choice: a cloud LLM via API, a rules engine, or local inference, depending on horsepower.

  • TTS : Piper generates natural speech locally, fast enough for quick replies on modest hardware. [5]
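One way to keep those stages swappable is to treat each as a plain callable. The following is a minimal sketch under that assumption; `VoicePipeline` and its field names are invented for illustration and do not come from any of the libraries above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Chains ASR -> brain -> TTS -> playback; each stage is injected."""
    transcribe: Callable[[bytes], str]   # ASR: captured audio -> text
    think: Callable[[str], str]          # rules engine or LLM: text -> reply
    synthesize: Callable[[str], bytes]   # TTS: reply text -> audio
    play: Callable[[bytes], None]        # audio output

    def handle_utterance(self, audio: bytes) -> str:
        text = self.transcribe(audio)
        reply = self.think(text)
        self.play(self.synthesize(reply))
        return reply


# Stand-in stages so the flow can be exercised without any audio hardware.
played = []
demo = VoicePipeline(
    transcribe=lambda audio: audio.decode("utf-8"),
    think=lambda text: "You said: " + text,
    synthesize=lambda reply: reply.encode("utf-8"),
    play=played.append,
)
print(demo.handle_utterance(b"hello"))  # You said: hello
```

Because every stage is just a function, replacing a rules engine with a cloud LLM, or whisper.cpp with another ASR, touches one argument rather than the loop itself.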


Quick Comparison Table 🔎

Tool | Best For | Price-ish | Why It Works
Porcupine Wake Word | Always-listening trigger | Free tier + | Low CPU, accurate, easy bindings [2]
whisper.cpp | Local ASR on the Pi | Open source | Good accuracy, CPU-friendly [4]
Faster-Whisper | Faster ASR on CPU/GPU | Open source | CTranslate2 optimizations
Piper TTS | Local speech output | Open source | Fast voices, many languages [5]
Cloud LLM API | Rich reasoning | Usage-based | Offloads heavy compute
Node-RED | Orchestrating actions | Open source | Visual flows, MQTT-friendly

Step-by-Step Build: Your First Voice Loop 🧩

We'll use Porcupine for the wake word, Whisper for transcription, a simple "brain" function for the reply (swap in the LLM of your choice), and Piper for speech. Keep it minimal, then iterate.

1) Install dependencies

sudo apt install -y python3-pip portaudio19-dev sox ffmpeg
pip3 install sounddevice numpy

If pip refuses a system-wide install on your Raspberry Pi OS release, run it inside a Python virtual environment instead.

  • Porcupine: grab the SDK/bindings for your language and follow the quick start (access key + keyword list + audio frames → .process ). [2]

  • Whisper (CPU-friendly): build whisper.cpp :

git clone https://github.com/ggml-org/whisper.cpp
cd whisper.cpp && cmake -B build && cmake --build build -j
./models/download-ggml-model.sh base.en
./build/bin/whisper-cli -m ./models/ggml-base.en.bin -f your.wav -otxt

The commands above mirror the project's quick start. [4]

Prefer Python? faster-whisper (CTranslate2) is often snappier than vanilla Python Whisper on modest CPUs.
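For the Python route, a rough sketch with faster-whisper could look like this. It assumes the package is installed (`pip3 install faster-whisper`) and that the `base.en` model and `int8` compute type suit your board; both are guesses to tune, not recommendations from the project. The import is deferred so the helper can be read and tested without the library.

```python
from types import SimpleNamespace


def join_segments(segments) -> str:
    """Concatenate the .text of each transcribed segment into one string."""
    return " ".join(seg.text.strip() for seg in segments).strip()


def transcribe_file(path: str, model_size: str = "base.en") -> str:
    """Transcribe a WAV file on the CPU with faster-whisper."""
    # Imported here so this module loads even where faster-whisper is absent.
    from faster_whisper import WhisperModel

    # Loading the model is slow; a real assistant would create it once and reuse it.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, _info = model.transcribe(path)
    return join_segments(segments)


# Stand-in segments that only mimic the .text attribute of real results.
fake_segments = [SimpleNamespace(text=" turn on"), SimpleNamespace(text=" the lights ")]
print(join_segments(fake_segments))  # turn on the lights
```

`transcribe_file("input.wav")` would then be a drop-in alternative to shelling out to whisper-cli.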

2) Set up Piper TTS

git clone https://github.com/rhasspy/piper
cd piper
make
# Download a voice model you like, e.g. en_US-amy
echo "Hello there." | ./piper --model voices/en/en_US-amy-medium.onnx --output_file hello.wav
aplay hello.wav

Piper is designed for on-device TTS with many voice and language options. [5]

3) A minimal assistant loop in Python

Deliberately compact: it waits for a wake phrase (a stub), records, transcribes with whisper.cpp , generates a reply (a placeholder), then speaks with Piper. Swap the placeholder for your favorite LLM or rule logic.

import os
import subprocess
import wave

import sounddevice as sd

WAKE_WORD = "hey computer"  # swap for Porcupine in production [2]
RECORD_SECONDS = 6
SAMPLE_RATE = 16000
CHANNELS = 1
WORKDIR = "/home/pi/assistant"
ASR_BIN = "/home/pi/whisper.cpp/build/bin/whisper-cli"  # [4]
ASR_MODEL = "/home/pi/whisper.cpp/models/ggml-base.en.bin"
PIPER_BIN = "/home/pi/piper/build/piper"  # [5] adjust to where your build put the binary
PIPER_VOICE = "/home/pi/piper/voices/en/en_US-amy-medium.onnx"

os.makedirs(WORKDIR, exist_ok=True)

def record_wav(path, seconds=RECORD_SECONDS):
    # Record mono 16-bit audio from the default input and save it as WAV.
    audio = sd.rec(int(seconds * SAMPLE_RATE), samplerate=SAMPLE_RATE,
                   channels=CHANNELS, dtype="int16")
    sd.wait()
    with wave.open(path, "wb") as w:
        w.setnchannels(CHANNELS)
        w.setsampwidth(2)
        w.setframerate(SAMPLE_RATE)
        w.writeframes(audio.tobytes())

def transcribe(path):
    # -of sets the output path without extension, so the text lands in <base>.txt.
    base = os.path.splitext(path)[0]
    cmd = [ASR_BIN, "-m", ASR_MODEL, "-f", path, "-otxt", "-of", base]
    subprocess.run(cmd, check=True, cwd=WORKDIR)
    with open(base + ".txt", "r", encoding="utf-8") as f:
        return f.read().strip()

def generate_reply(prompt):
    # Placeholder brain: replace with rules or an LLM call.
    if "weather" in prompt.lower():
        return "I can't see the clouds, but it might be fine. Bring a jacket just in case."
    return "You said: " + prompt

def speak(text):
    # Piper reads the text from stdin and writes a WAV file, which aplay then plays.
    out_path = f"{WORKDIR}/reply.wav"
    proc = subprocess.Popen([PIPER_BIN, "--model", PIPER_VOICE, "--output_file", out_path],
                            stdin=subprocess.PIPE)
    proc.stdin.write(text.encode("utf-8"))
    proc.stdin.close()
    proc.wait()
    subprocess.run(["aplay", out_path], check=True)

print("Assistant ready. Type the wake phrase to test.")
while True:
    typed = input("> ").strip().lower()
    if typed == WAKE_WORD:
        wav_path = f"{WORKDIR}/input.wav"
        record_wav(wav_path)
        text = transcribe(wav_path)
        reply = generate_reply(text)
        print("User:", text)
        print("Assistant:", reply)
        speak(reply)
    else:
        print("Type the wake phrase to test the loop.")

For real wake-word detection, integrate Porcupine's streaming detector (low CPU, per-keyword sensitivity). [2]
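A streaming detector is fed fixed-size frames of PCM samples, one call per frame. The sketch below shows only that framing loop, with a deliberately crude stand-in detector; `LoudFrameDetector` is hypothetical and is not a wake-word engine. It assumes, as the Porcupine quick start describes, a detector object with a `frame_length` attribute and a `process(frame)` method that returns a keyword index, or -1 when nothing was heard.

```python
from typing import Iterator, Optional, Sequence, Tuple


def iter_frames(pcm: Sequence[int], frame_length: int) -> Iterator[Sequence[int]]:
    """Yield consecutive full frames; a trailing partial frame is dropped."""
    for start in range(0, len(pcm) - frame_length + 1, frame_length):
        yield pcm[start:start + frame_length]


def first_trigger(detector, pcm: Sequence[int]) -> Optional[Tuple[int, int]]:
    """Return (frame_index, keyword_index) of the first detection, else None."""
    for index, frame in enumerate(iter_frames(pcm, detector.frame_length)):
        keyword_index = detector.process(frame)
        if keyword_index >= 0:
            return index, keyword_index
    return None


class LoudFrameDetector:
    """Stand-in with the same interface: fires on any loud sample in a frame."""
    frame_length = 4

    def __init__(self, threshold: int = 1000):
        self.threshold = threshold

    def process(self, frame: Sequence[int]) -> int:
        return 0 if max(abs(s) for s in frame) > self.threshold else -1


pcm = [0] * 8 + [0, 5000, 0, 0] + [0] * 3
print(first_trigger(LoudFrameDetector(), pcm))  # (2, 0)
```

In a live setup the frames would come from the microphone stream instead of a list, and the real engine would replace the stand-in.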


Audio Tuning That Actually Matters 🎚️

A few small fixes make your assistant feel 10× smarter:

  • Mic distance : 30–60 cm is a sweet spot for many USB microphones.

  • Levels : avoid clipping on input and keep playback sane; fix the routing before chasing ghosts in your code. On Raspberry Pi OS, you can manage the output device and levels with the system tools or raspi-config . [1]

  • Room acoustics : hard walls cause echo; a soft mat under the mic helps.

  • Wake word threshold : too sensitive → ghost triggers; too strict → you'll be yelling at plastic. Porcupine lets you adjust sensitivity per keyword. [2]

  • Thermals : long transcriptions on a Pi 5 benefit from the official active cooler for sustained performance. [1]
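To put numbers on "levels", you can measure peak and RMS on a test recording. This is a rough sketch that assumes a 16-bit PCM WAV such as the one the assistant loop records; the 0.99 clipping threshold is an arbitrary choice for illustration.

```python
import array
import math
import sys
import wave


def levels_from_samples(samples, clip_threshold: float = 0.99) -> dict:
    """Compute peak and RMS (both scaled to 0..1) for signed 16-bit samples."""
    if len(samples) == 0:
        return {"peak": 0.0, "rms": 0.0, "clipped": False}
    peak = max(abs(s) for s in samples) / 32768.0
    rms = math.sqrt(sum(s * s for s in samples) / len(samples)) / 32768.0
    return {"peak": peak, "rms": rms, "clipped": peak >= clip_threshold}


def level_report(path: str) -> dict:
    """Read a 16-bit PCM WAV file and report its levels."""
    with wave.open(path, "rb") as w:
        if w.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM audio")
        samples = array.array("h")
        samples.frombytes(w.readframes(w.getnframes()))
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    return levels_from_samples(samples)


print(levels_from_samples([0, 16384, -16384, 0])["peak"])  # 0.5
```

A peak pinned near 1.0 suggests the input gain is too high; a very low RMS while you are speaking suggests the mic is too far away or the capture level is too low.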


From Toy to Appliance: Services, Autostart, Health Checks 🧯

Humans forget to run scripts. Computers forget to be nice. Turn your loop into a managed service:

  1. Create a systemd unit:

[Unit]
Description=DIY Voice Assistant
After=network.target sound.target

[Service]
User=pi
WorkingDirectory=/home/pi/assistant
ExecStart=/usr/bin/python3 /home/pi/assistant/assistant.py
Restart=always
RestartSec=3

[Install]
WantedBy=multi-user.target

  2. Enable it:

sudo cp assistant.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now assistant.service

  3. Tail the logs:

journalctl -u assistant -f

Now it starts on boot, restarts on crash, and generally behaves like an appliance. A little boring, a lot better.
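systemd restarts a crashed process, but it will not notice a loop that is alive and stuck. One simple approach, sketched here as an assumption rather than something the stack provides, is a heartbeat file: the loop writes a timestamp on every pass and a separate check flags it when the timestamp goes stale. The function names and the 30-second limit are illustrative.

```python
import time
from typing import Optional


def beat(path: str) -> None:
    """Record the current time; call this once per pass through the main loop."""
    with open(path, "w", encoding="utf-8") as f:
        f.write(str(time.time()))


def read_beat(path: str) -> Optional[float]:
    """Return the last recorded timestamp, or None if missing or unreadable."""
    try:
        with open(path, encoding="utf-8") as f:
            return float(f.read().strip())
    except (OSError, ValueError):
        return None


def is_fresh(last_beat: Optional[float], now: float, max_age_s: float = 30.0) -> bool:
    """True when a heartbeat exists and is no older than max_age_s seconds."""
    return last_beat is not None and (now - last_beat) <= max_age_s


print(is_fresh(100.0, now=120.0))  # True
print(is_fresh(100.0, now=140.0))  # False
```

A cron job or a second small service could call `is_fresh(read_beat(path), time.time())` and restart the assistant when it returns False. Keep in mind that a loop blocked on input for long periods needs a generous `max_age_s`.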


Skill System: Make It Actually Useful at Home 🏠✨

Once voice-in and voice-out are solid, add actions:

  • Intent router : simple keyword routes for common tasks.

  • Smart home : publish events to MQTT or call Home Assistant's HTTP endpoints.

  • Plugins : quick Python functions such as set_timer , what_is_the_time , play_radio , run_scene .

Even with a cloud LLM in the loop, route obvious local commands first for speed and reliability.
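A keyword router with a fallback captures the "local first" idea in a few lines. This is a minimal sketch: the keyword lists are invented, `set_timer` is a placeholder that does not start a real timer, and `fallback` stands in for whatever LLM call you use.

```python
from datetime import datetime
from typing import Callable


def what_is_the_time(text: str) -> str:
    return datetime.now().strftime("It is %H:%M.")


def set_timer(text: str) -> str:
    # Placeholder: a real skill would parse the duration and schedule a callback.
    return "Timer set."


# Each route pairs trigger phrases with the plugin that handles them.
ROUTES = [
    (("what time", "the time"), what_is_the_time),
    (("timer",), set_timer),
]


def route(text: str, fallback: Callable[[str], str]) -> str:
    """Try local keyword routes first; hand anything else to the fallback."""
    lowered = text.lower()
    for keywords, handler in ROUTES:
        if any(keyword in lowered for keyword in keywords):
            return handler(text)
    return fallback(text)


print(route("Set a timer for five minutes", fallback=lambda t: "(ask the LLM)"))  # Timer set.
```

Obvious commands never leave the device and answer immediately; only open-ended requests pay the latency and cost of the cloud call.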


Local Only vs Cloud Assist: Trade-offs You'll Feel 🌓

Local only
Pros: private, offline, predictable costs.
Cons: heavier models can be slow on small boards. Whisper's multilingual training helps with robustness if you keep it on-device or on a nearby server. [3]

Cloud assist
Pros: powerful reasoning, larger context windows.
Cons: data leaves the device, network dependency, variable costs.

Hybrid often wins: wake word + ASR local → call an API for reasoning → TTS local. [2][3][5]


Troubleshooting: Strange Gremlins and Quick Fixes 👾

  • Wake word false triggers : lower the sensitivity or try a different microphone. [2]

  • ASR lag : use a smaller Whisper model or build whisper.cpp with release flags ( -j --config Release ). [4]

  • Choppy TTS : pre-generate common phrases; confirm your audio device and sample rates.

  • No mic detected : check arecord -l and the mixers.

  • Thermal throttling : use the official Active Cooler on the Pi 5 for sustained performance. [1]
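Pre-generating common phrases amounts to a small on-disk cache keyed by the text. In the sketch below, `synthesize(text, out_path)` is a hypothetical callable you would write around your TTS (for example the Piper call from the assistant loop); the hashing scheme and file layout are assumptions for illustration.

```python
import hashlib
import os
from typing import Callable


def cache_path(cache_dir: str, text: str) -> str:
    """Map a phrase to a stable WAV filename, ignoring case and outer spaces."""
    digest = hashlib.sha256(text.strip().lower().encode("utf-8")).hexdigest()[:16]
    return os.path.join(cache_dir, digest + ".wav")


def cached_speech(text: str, cache_dir: str,
                  synthesize: Callable[[str, str], None]) -> str:
    """Return the WAV path for text, synthesizing only on a cache miss."""
    path = cache_path(cache_dir, text)
    if not os.path.exists(path):
        os.makedirs(cache_dir, exist_ok=True)
        synthesize(text, path)
    return path
```

Warming the cache at startup with phrases like "OK" or "Sorry, I didn't catch that" means those replies play immediately, with no synthesis delay.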


Security and Privacy Notes You Should Actually Read 🔒

  • Keep your Pi updated with APT.

  • If you use any cloud API, log what you send and consider redacting private bits locally first.

  • Run services with the least privilege needed; avoid sudo in ExecStart unless it is required.

  • Offer a local-only mode for guests or quiet hours.
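Two of those notes translate directly into small guards. The sketch below is illustrative only: the regular expressions catch simple email and phone patterns and will miss plenty, and the 22:00 to 07:00 quiet window is an invented default.

```python
import re
from datetime import time

EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def redact(text: str) -> str:
    """Mask obvious emails and phone numbers before text leaves the device."""
    text = EMAIL.sub("[email]", text)
    return PHONE.sub("[phone]", text)


def cloud_allowed(now: time, quiet_start: time = time(22, 0),
                  quiet_end: time = time(7, 0), guest_mode: bool = False) -> bool:
    """Local-only gate: block cloud calls in guest mode or during quiet hours."""
    if guest_mode:
        return False
    if quiet_start <= quiet_end:
        in_quiet = quiet_start <= now < quiet_end
    else:  # the quiet window wraps past midnight
        in_quiet = now >= quiet_start or now < quiet_end
    return not in_quiet


print(redact("mail bob@example.com or call +1 555 123 4567"))
# mail [email] or call [phone]
```

The assistant would call `cloud_allowed(datetime.now().time())` before any API request and fall back to local rules when it returns False.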


Build Variants: Mix and Match Like a Sandwich 🥪

  • Ultra-local : Porcupine + whisper.cpp + Piper + simple rules. Private and sturdy. [2][4][5]

  • Speedy cloud assist : Porcupine + (a smaller local Whisper or cloud ASR) + local TTS + cloud LLM.

  • Home automation hub : add Node-RED or Home Assistant flows for routines, scenes, and sensors.


Example Skill: Lights On via MQTT 💡

import paho.mqtt.client as mqtt

MQTT_HOST = "192.168.1.10"
TOPIC = "home/livingroom/light/set"

def set_light(state: str):
    # Note: paho-mqtt 2.x may expect a callback API version argument to Client().
    client = mqtt.Client()
    client.connect(MQTT_HOST, 1883, 60)
    payload = "ON" if state.lower().startswith("on") else "OFF"
    client.publish(TOPIC, payload, qos=1, retain=False)
    client.disconnect()

# if "turn on the lights" in text: set_light("on")

Add a voice line such as "turn on the living room lamp," and you'll feel like a wizard.
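Mapping the transcript to an "on" or "off" state can stay very simple. The phrase lists in this sketch are assumptions chosen for illustration; extend them to match how your household actually talks.

```python
from typing import Optional

ON_PHRASES = ("turn on", "switch on", "lights on")
OFF_PHRASES = ("turn off", "switch off", "lights off")


def parse_light_command(text: str) -> Optional[str]:
    """Return "on" or "off" for a lighting request, or None if it is not one."""
    lowered = text.lower()
    if "light" not in lowered and "lamp" not in lowered:
        return None
    if any(phrase in lowered for phrase in ON_PHRASES):
        return "on"
    if any(phrase in lowered for phrase in OFF_PHRASES):
        return "off"
    return None


print(parse_light_command("Turn on the living room lamp"))  # on
```

In the assistant loop, a non-None result would be passed to the MQTT skill, for example `set_light(state)`, before any LLM is consulted.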


Why This Stack Works in Practice 🧪

  • Porcupine is efficient and accurate at wake-word detection on small boards, which makes always-on listening feasible. [2]

  • Whisper's large, multilingual training makes it robust across varied environments and accents. [3]

  • whisper.cpp keeps that power usable on CPU-only devices such as the Pi. [4]

  • Piper keeps responses fast without sending audio to a cloud TTS. [5]


Too Long, Didn't Read

Build a private DIY AI assistant with a Raspberry Pi by combining Porcupine for the wake word, Whisper (via whisper.cpp ) for ASR, your choice of brain for replies, and Piper for local TTS. Wrap it as a systemd service, tune the audio, and wire in MQTT or HTTP actions. It is cheaper than you think, and oddly delightful to live with. [1][2][3][4][5]


References

  1. Raspberry Pi Software and Cooling: Raspberry Pi Imager (download and use) and Pi 5 Active Cooler product information

  2. Porcupine Wake Word: SDK and quick start (keywords, sensitivity, local inference)

  3. Whisper (ASR model): robust, multilingual ASR trained on ~680k hours

    • Radford et al., Robust Speech Recognition via Large-Scale Weak Supervision (Whisper)

  4. whisper.cpp: CPU-friendly Whisper inference with a CLI and build steps

  5. Piper TTS: fast, local neural TTS with many voices/languages
