How to Build a DIY AI Assistant with Raspberry Pi

Want a small voice assistant that follows your lead, runs on your own hardware, and won't accidentally order a dozen pineapples because it misheard you? A DIY AI assistant on a Raspberry Pi is surprisingly achievable, fun, and flexible. You'll wire together a wake word, speech recognition (ASR = automatic speech recognition), a natural-language brain (rules or an LLM), and text-to-speech (TTS). Add a few scripts, one or two services, and some careful audio tweaks, and you have a pocketable smart speaker that obeys your rules.

Let's get you from zero to talking-to-your-Pi without the usual hair-pulling. We'll cover parts, setup, code, comparisons, gotchas... the whole burrito. 🌯


What Makes a Good DIY AI Assistant with Raspberry Pi ✅

  • Private by default - keep audio local where possible. You decide what leaves the device.

  • Modular - swap components like Lego: wake word engine, ASR, LLM, TTS.

  • Affordable - mostly open source, commodity mics, speakers, and a Pi.

  • Hackable - want home automation, dashboards, routines, custom skills? Easy.

  • Reliable - service-managed, boots and starts listening automatically.

  • Fun - you'll learn a lot about audio, processes, and event-driven design.

Small tip: If you're using a Raspberry Pi 5 and plan to run heavier local models, a clip-on cooler helps under sustained load. (When in doubt, pick the official Active Cooler designed for the Pi 5.) [1]


Parts and Tools You'll Need 🧰

  • Raspberry Pi: a Pi 4 or Pi 5 is recommended for headroom.

  • microSD card: 32 GB+ recommended.

  • USB microphone: a simple USB conference mic works well.

  • Speaker: a USB or 3.5 mm speaker, or an I2S amp HAT.

  • Network: Ethernet or Wi-Fi.

  • Optional niceties: a case, an active cooler for the Pi 5, a push button for push-to-talk, an LED ring. [1]

OS & Baseline Setup

  1. Flash Raspberry Pi OS with Raspberry Pi Imager. It's the straightforward way to get a bootable microSD with the presets you want. [1]

  2. Boot, connect to the network, then update packages:

sudo apt update && sudo apt upgrade -y

  3. Audio basics: On Raspberry Pi OS you can set the default output, levels, and devices via the desktop UI or raspi-config. USB and HDMI audio are supported on all models; Bluetooth output is available on models with Bluetooth. [1]

  4. Validate devices:

arecord -l
aplay -l

Then test capture and playback. If the levels look odd, check the mixers and defaults before blaming the mic.
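
If you'd rather script the device check, one option is to parse the `arecord -l` listing. The sketch below assumes the usual ALSA line shape (`card N: Id [Description], device M: ...`); the helper names `parse_cards` and `list_capture_cards` and the `SAMPLE` text are made up for illustration and are not part of any library.

```python
import re
import subprocess

# Matches lines such as:
#   card 1: Device [USB Audio Device], device 0: USB Audio [USB Audio]
CARD_LINE = re.compile(r"^card (\d+): (\S+) \[(.*?)\], device (\d+):")

def parse_cards(listing: str):
    """Return (card, device, description) tuples found in `arecord -l` style text."""
    found = []
    for line in listing.splitlines():
        m = CARD_LINE.match(line.strip())
        if m:
            found.append((int(m.group(1)), int(m.group(4)), m.group(3)))
    return found

def list_capture_cards():
    """Run `arecord -l` on the Pi and parse its output (needs alsa-utils)."""
    out = subprocess.run(["arecord", "-l"], capture_output=True, text=True, check=True)
    return parse_cards(out.stdout)

# Illustrative listing (an assumption about what a USB mic might report).
SAMPLE = """**** List of CAPTURE Hardware Devices ****
card 1: Device [USB Audio Device], device 0: USB Audio [USB Audio]
  Subdevices: 1/1
  Subdevice #0: subdevice #0
"""
```

An empty result from `list_capture_cards()` is a quick hint that the mic is not being seen at all, which is worth ruling out before touching levels.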


Architecture at a Glance 🗺️

A sensible assistant flow on a Raspberry Pi looks like this:

Wake word → live audio capture → ASR transcription → intent handling or LLM → response text → TTS → audio playback → optional actions via MQTT or HTTP.

  • Wake word: Porcupine is small, accurate, and runs locally with per-keyword sensitivity control. [2]

  • ASR: Whisper is a multilingual, general-purpose ASR model trained on ~680k hours of audio; it is robust to accents and background noise. For on-device use, whisper.cpp provides a lean C/C++ inference path. [3][4]

  • Brain: Your choice - a cloud LLM via API, a rules engine, or local inference, depending on horsepower.

  • TTS: Piper generates natural speech locally, fast enough for snappy responses on modest hardware. [5]
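
To make the "swap components like Lego" idea concrete, here is a minimal sketch of the flow above as three pluggable stages. The `Pipeline` class and the stand-in lambdas are hypothetical; real stages would wrap whisper.cpp, your brain of choice, and Piper.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Pipeline:
    """Each stage is a plain callable, so engines can be swapped independently."""
    transcribe: Callable[[bytes], str]   # ASR: audio bytes -> text
    think: Callable[[str], str]          # brain: text -> reply text
    synthesize: Callable[[str], bytes]   # TTS: reply text -> audio bytes

    def handle(self, audio: bytes) -> bytes:
        text = self.transcribe(audio)
        reply = self.think(text)
        return self.synthesize(reply)

# Stand-in stages for a dry run without any audio hardware.
demo = Pipeline(
    transcribe=lambda audio: audio.decode("utf-8"),
    think=lambda text: "You said: " + text,
    synthesize=lambda reply: reply.encode("utf-8"),
)
```

Keeping each stage behind a simple function signature means you can test the glue logic on a laptop and only plug in the real engines on the Pi.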


Quick Comparison Table 🔎

| Tool | Best for | Price-ish | Why it works |
| --- | --- | --- | --- |
| Porcupine Wake Word | Always-listening trigger | Free tier + | Low CPU, accurate, easy bindings [2] |
| whisper.cpp | Local ASR on the Pi | Open source | Good accuracy, CPU-friendly [4] |
| faster-whisper | Faster ASR on CPU/GPU | Open source | CTranslate2 optimizations |
| Piper TTS | Local speech output | Open source | Fast voices, many languages [5] |
| Cloud LLM API | Rich reasoning | Usage based | Offloads heavy compute |
| Node-RED | Orchestrating actions | Open source | Visual flows, MQTT friendly |

Step-by-Step Build: Your First Voice Loop 🧩

We'll use Porcupine for the wake word, Whisper for transcription, a lightweight "brain" function for the reply (swap in your LLM of choice), and Piper for speech. Keep it minimal, then iterate.

1) Install dependencies

sudo apt install -y python3-pip portaudio19-dev sox ffmpeg
pip3 install sounddevice numpy

  • Porcupine: grab the SDK/bindings for your language and follow the quick start (access key + keyword list + audio frames → .process). [2]

  • Whisper (CPU-friendly): build whisper.cpp:

git clone https://github.com/ggml-org/whisper.cpp
cd whisper.cpp && cmake -B build && cmake --build build -j
./models/download-ggml-model.sh base.en
./build/bin/whisper-cli -m .

The commands above follow the project's quick start. [4]

Prefer Python? faster-whisper (CTranslate2) is often faster than vanilla Python Whisper on modest CPUs.

2) Set up Piper TTS

git clone https://github.com/rhasspy/piper
cd piper
make
# Download a voice model you like, e.g., en_US-amy
echo "Hello there." | ./piper --model voices/en/en_US-amy-medium.onnx --output_file hello.wav
aplay hello.wav

Piper is designed for on-device TTS with many voice and language options. [5]

3) A minimal assistant loop in Python

Deliberately compact: it waits for a wake phrase (stub), records, transcribes with whisper.cpp, generates a reply (placeholder), then speaks via Piper. Swap the placeholder for your favorite LLM or rule logic.

import os, subprocess, wave
import sounddevice as sd

WAKE_WORD = "hey computer"  # swap for Porcupine in production [2]
RECORD_SECONDS = 6
SAMPLE_RATE = 16000
CHANNELS = 1
WORKDIR = "/home/pi/assistant"
ASR_BIN = "/home/pi/whisper.cpp/build/bin/whisper-cli"  # [4]
ASR_MODEL = "/home/pi/whisper.cpp/models/ggml-base.en.bin"
PIPER_BIN = "/home/pi/piper/build/piper"  # [5]
PIPER_VOICE = "/home/pi/piper/voices/en/en_US-amy-medium.onnx"

os.makedirs(WORKDIR, exist_ok=True)

def record_wav(path, seconds=RECORD_SECONDS):
    # Record mono 16-bit audio from the default input device.
    audio = sd.rec(int(seconds * SAMPLE_RATE), samplerate=SAMPLE_RATE,
                   channels=CHANNELS, dtype='int16')
    sd.wait()
    with wave.open(path, 'wb') as w:
        w.setnchannels(CHANNELS); w.setsampwidth(2); w.setframerate(SAMPLE_RATE)
        w.writeframes(audio.tobytes())

def transcribe(path):
    # -of sets the output base name, so the transcript lands at <base>.txt.
    out_base = os.path.splitext(path)[0]
    cmd = [ASR_BIN, "-m", ASR_MODEL, "-f", path, "-otxt", "-of", out_base]
    subprocess.run(cmd, check=True, cwd=WORKDIR)
    with open(out_base + ".txt", "r", encoding="utf-8") as f:
        return f.read().strip()

def generate_reply(prompt):
    # Placeholder brain: replace with rules or an LLM call.
    if "weather" in prompt.lower():
        return "I can't see the clouds, but it might be fine. Bring a jacket just in case."
    return "You said: " + prompt

def speak(text):
    # Pipe the reply text into Piper, then play the resulting WAV.
    proc = subprocess.Popen([PIPER_BIN, "--model", PIPER_VOICE,
                             "--output_file", f"{WORKDIR}/reply.wav"],
                            stdin=subprocess.PIPE)
    proc.stdin.write(text.encode("utf-8")); proc.stdin.close(); proc.wait()
    subprocess.run(["aplay", f"{WORKDIR}/reply.wav"], check=True)

print("Assistant ready. Type the wake phrase to test.")
while True:
    typed = input("> ").strip().lower()
    if typed == WAKE_WORD:
        wav_path = f"{WORKDIR}/input.wav"
        record_wav(wav_path)
        text = transcribe(wav_path)
        reply = generate_reply(text)
        print("User:", text); print("Assistant:", reply)
        speak(reply)
    else:
        print("Type the wake phrase to test the loop.")

For a real wake word, integrate Porcupine's streaming detector (low CPU, per-keyword sensitivity). [2]
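
Streaming detectors of this kind consume audio in fixed-size frames and report a keyword index per frame. The sketch below shows only that framing logic, with a generic `process` callable standing in for the engine call; the helper names, the frame length used in the example, and `fake_process` are assumptions for illustration, so take the real frame length and return convention from the Porcupine quick start. [2]

```python
from typing import Callable, Iterator, Sequence

def frames(samples: Sequence[int], frame_length: int) -> Iterator[Sequence[int]]:
    """Yield consecutive full frames; a trailing partial frame is dropped."""
    for start in range(0, len(samples) - frame_length + 1, frame_length):
        yield samples[start:start + frame_length]

def first_detection(samples: Sequence[int], frame_length: int,
                    process: Callable[[Sequence[int]], int]) -> int:
    """Return the index of the first frame where `process` reports a keyword, else -1.

    `process` is a stand-in for a wake-word engine call that returns a keyword
    index (>= 0) on detection and a negative number otherwise.
    """
    for i, frame in enumerate(frames(samples, frame_length)):
        if process(frame) >= 0:
            return i
    return -1

# Fake detector for a dry run: "detects" any frame containing a loud sample.
def fake_process(frame: Sequence[int]) -> int:
    return 0 if max(abs(s) for s in frame) > 10000 else -1
```

In the loop above, a detection would take the place of the typed wake phrase and trigger `record_wav`.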


Audio Tuning That Actually Matters 🎚️

A few small fixes make your assistant feel 10× smarter:

  • Mic distance: 30–60 cm is a sweet spot for many USB mics.

  • Levels: avoid clipping on input and keep playback sane; fix routing before chasing code ghosts. On Raspberry Pi OS, you can manage the output device and levels via system tools or raspi-config. [1]

  • Room acoustics: hard walls cause echoes; a soft mat under the mic helps.

  • Wake word threshold: too sensitive → ghost triggers; too strict → you'll be yelling at plastic. Porcupine lets you tweak sensitivity per keyword. [2]

  • Thermals: long transcriptions on a Pi 5 benefit from the official active cooler for sustained performance. [1]
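
To check the "avoid clipping" advice with numbers rather than by ear, you could inspect a test recording. This is a rough sketch using only the standard library; the function name `clipping_report` and the 32000 threshold are arbitrary choices, and it assumes a 16-bit PCM WAV like the one the loop above records.

```python
import struct
import wave

def clipping_report(path: str, threshold: int = 32000):
    """Return (peak, clipped_fraction) for a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as w:
        if w.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM")
        raw = w.readframes(w.getnframes())
    count = len(raw) // 2
    if count == 0:
        return 0, 0.0
    # WAV samples are little-endian signed 16-bit integers.
    samples = struct.unpack("<%dh" % count, raw[:count * 2])
    peak = max(abs(s) for s in samples)
    clipped = sum(1 for s in samples if abs(s) >= threshold)
    return peak, clipped / count
```

A peak pinned near 32767 with a noticeable clipped fraction suggests lowering the input gain; a very low peak suggests the opposite.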


From Toy to Appliance: Services, Autostart, Health Checks 🧯

Humans forget to run scripts. Computers forget to be nice. Turn your loop into a managed service:

  1. Create a systemd unit:

[Unit]
Description=DIY Voice Assistant
After=network.target sound.target

[Service]
User=pi
WorkingDirectory=/home/pi/assistant
ExecStart=/usr/bin/python3 /home/pi/assistant/assistant.py
Restart=always
RestartSec=3

[Install]
WantedBy=multi-user.target

  2. Enable it:

sudo cp assistant.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now assistant.service

  3. Tail the logs:

journalctl -u assistant -f

Now it starts on boot, restarts on crash, and generally behaves like an appliance. A little boring, a lot better.
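
For the health-check part, one simple approach is a preflight step that confirms the binaries and models exist before the loop starts. This is only a sketch: `preflight` and `main` are hypothetical helper names, and what counts as "healthy" is up to you.

```python
import os
import sys

def preflight(required_paths):
    """Return the list of required files that are missing or unreadable."""
    return [p for p in required_paths
            if not (os.path.isfile(p) and os.access(p, os.R_OK))]

def main(required_paths) -> int:
    missing = preflight(required_paths)
    for p in missing:
        # stderr ends up in the journal when run under systemd.
        print(f"missing: {p}", file=sys.stderr)
    return 1 if missing else 0
```

You might call `main([...])` with the ASR and TTS paths at the top of assistant.py and exit with a non-zero status if it fails, so the problem shows up clearly in `journalctl` instead of as a silent assistant.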


Skill System: Make It Actually Useful at Home 🏠✨

Once voice-in and voice-out are solid, add actions:

  • Intent router: simple keyword routes for common tasks.

  • Smart home: publish events to MQTT or call Home Assistant's HTTP endpoints.

  • Plugins: quick Python functions like set_timer, what_is_the_time, play_radio, run_scene.

Even with a cloud LLM in the loop, route obvious local commands first for speed and reliability.
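
A keyword-based intent router can be very small. The sketch below is one possible shape, not a prescribed design: the handler bodies are placeholders, and `fallback` stands for whatever you use when no route matches (for example, an LLM call).

```python
import datetime

def what_is_the_time(text: str) -> str:
    return "It is " + datetime.datetime.now().strftime("%H:%M")

def set_timer(text: str) -> str:
    # Placeholder: a real skill would parse the duration and schedule something.
    return "Timer noted: " + text

# Ordered (keywords, handler) pairs; the first route whose keywords all appear wins.
ROUTES = [
    (("what", "time"), what_is_the_time),
    (("timer",), set_timer),
]

def route(text: str, fallback):
    lowered = text.lower()
    for keywords, handler in ROUTES:
        if all(k in lowered for k in keywords):
            return handler(text)
    return fallback(text)
```

For example, `route("Set a timer for 5 minutes", my_llm)` is answered locally and never reaches `my_llm` (a hypothetical fallback function).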


Local Only vs Cloud Assist: Trade-offs You'll Feel 🌓

Local only
Pros: private, offline, predictable costs.
Cons: heavier models may be slow on small boards. Whisper's multilingual training helps with robustness if you keep it on-device or on a nearby server. [3]

Cloud assist
Pros: powerful reasoning, larger context windows.
Cons: data leaves the device, network dependence, variable costs.

A hybrid often wins: wake word + ASR local → call an API for reasoning → local TTS. [2][3][5]
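
The hybrid idea can be sketched as "local first, cloud second, degrade gracefully when offline". Both `local_rules` and `cloud_llm` below are hypothetical callables you would supply; no particular cloud API is assumed.

```python
def hybrid_reply(text, local_rules, cloud_llm,
                 offline_message="I can't reach the cloud right now."):
    """Try local rules first, then the cloud brain, and fall back when offline.

    local_rules(text) returns a reply string, or None when no rule matches.
    cloud_llm(text) returns a reply string and may raise on network errors.
    """
    reply = local_rules(text)
    if reply is not None:
        return reply
    try:
        return cloud_llm(text)
    except OSError:
        # ConnectionError and TimeoutError are both subclasses of OSError.
        return offline_message
```

If your HTTP client raises its own exception types, you would add them to the `except` clause.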


Troubleshooting: Weird Gremlins & Quick Fixes 👾

  • Wake word false triggers: lower the sensitivity or try a different mic. [2]

  • ASR lag: use a smaller Whisper model or build whisper.cpp with release flags ( -j --config Release ). [4]

  • Choppy TTS: pre-generate common phrases; confirm your audio device and sample rates.

  • No mic detected: check arecord -l and the mixers.

  • Thermal throttling: use the official Active Cooler on the Pi 5 for sustained performance. [1]
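
Pre-generating common phrases can be as simple as a file cache keyed by the text. In this sketch, `synthesize(text, path)` is a hypothetical callable that writes a WAV (for instance, a thin wrapper around the Piper command), and the hashing scheme is just one reasonable choice.

```python
import hashlib
import os

def cached_phrase(text: str, cache_dir: str, synthesize) -> str:
    """Return a WAV path for `text`, calling synthesize(text, path) only on a cache miss."""
    os.makedirs(cache_dir, exist_ok=True)
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]
    path = os.path.join(cache_dir, key + ".wav")
    if not os.path.exists(path):
        synthesize(text, path)
    return path
```

Phrases such as "OK" or "Sorry, I didn't catch that" then play straight from disk without invoking the TTS engine again.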


Security & Privacy Notes You Should Actually Read 🔒

  • Keep your Pi updated with APT.

  • If you use any cloud API, log what you send and consider redacting personal bits locally first.

  • Run services with least privilege; avoid sudo in ExecStart unless required.

  • Provide a local-only mode for guests or quiet hours.
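
As a starting point for "log what you send and redact first", here is a rough sketch that masks email addresses and phone-like numbers before text leaves the device. The patterns are deliberately simple assumptions and will miss plenty of personal data; treat them as a floor, not a guarantee.

```python
import logging
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
# Digits with optional spaces, dots, dashes, or brackets: at least 8 characters long.
PHONE = re.compile(r"\+?\d[\d\s().-]{6,}\d")

def redact(text: str) -> str:
    """Mask email addresses and phone-like numbers."""
    text = EMAIL.sub("[email]", text)
    return PHONE.sub("[phone]", text)

def prepare_for_cloud(text: str, logger=logging.getLogger("assistant.cloud")) -> str:
    """Redact, log exactly what will be sent, and return the safe text."""
    safe = redact(text)
    logger.info("sending to cloud: %s", safe)
    return safe
```

Logging the redacted text rather than the original keeps your own logs from becoming the privacy leak.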


Build Variants: Mix and Match Like a Sandwich 🥪

  • Ultra-local: Porcupine + whisper.cpp + Piper + simple rules. Private and sturdy. [2][4][5]

  • Fast cloud assist: Porcupine + (a smaller local Whisper or cloud ASR) + local TTS + cloud LLM.

  • Home automation central: add Node-RED or Home Assistant flows for routines, scenes, and sensors.


Example Skill: Lights On via MQTT 💡

import paho.mqtt.client as mqtt

MQTT_HOST = "192.168.1.10"
TOPIC = "home/livingroom/light/set"

def set_light(state: str):
    # Note: with paho-mqtt 2.x, construct the client as
    # mqtt.Client(mqtt.CallbackAPIVersion.VERSION2).
    client = mqtt.Client()
    client.connect(MQTT_HOST, 1883, 60)
    client.loop_start()  # run the network loop so the QoS 1 handshake can complete
    payload = "ON" if state.lower().startswith("on") else "OFF"
    info = client.publish(TOPIC, payload, qos=1, retain=False)
    info.wait_for_publish()  # block until the broker acknowledges the message
    client.loop_stop()
    client.disconnect()

# if "turn on the lights" in text: set_light("on")

Add a voice line like "turn on the living room light," and you'll feel like a wizard.
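
To connect a spoken phrase to an MQTT message, you need a small parsing step. The sketch below maps a phrase to a (topic, payload) pair; the `ROOMS` table and the `home/<room>/light/set` topic layout are assumptions for illustration and should match whatever your broker and devices actually use.

```python
import re

# Hypothetical room names mapped to topic segments; adjust to your own layout.
ROOMS = {"living room": "livingroom", "kitchen": "kitchen", "bedroom": "bedroom"}

def parse_light_command(text: str):
    """Return (topic, payload) for phrases like 'turn on the living room light', else None."""
    lowered = text.lower()
    m = re.search(r"\b(?:turn|switch)\s+(on|off)\b", lowered)
    if not m:
        return None
    for spoken, segment in ROOMS.items():
        if spoken in lowered:
            return f"home/{segment}/light/set", m.group(1).upper()
    return None
```

The returned pair can be handed to an MQTT publish call; a `None` result means the phrase should go to the next skill or to the brain.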


Why This Stack Works in Practice 🧪

  • Porcupine is efficient and accurate at wake-word detection on small boards, which makes always-listening feasible. [2]

  • Whisper's large, multilingual training makes it robust across varied environments and accents. [3]

  • whisper.cpp keeps that power usable on CPU-only devices like the Pi. [4]

  • Piper keeps responses snappy without shipping audio to a cloud TTS. [5]


Too Long, Didn't Read

Build a modular, private DIY AI assistant with Raspberry Pi by combining Porcupine for the wake word, Whisper (via whisper.cpp) for ASR, your choice of brain for replies, and Piper for local TTS. Wrap it as a systemd service, tune the audio, and wire in MQTT or HTTP actions. It's cheaper than you think, and oddly delightful to live with. [1][2][3][4][5]


References

  1. Raspberry Pi Software & Cooling - Raspberry Pi Imager (download and use) and Pi 5 Active Cooler product information

  2. Porcupine Wake Word - SDK & quick start (keywords, sensitivity, local inference)

  3. Whisper (ASR model) - Multilingual, robust ASR trained on ~680k hours

    • Radford et al., Robust Speech Recognition via Large-Scale Weak Supervision (Whisper)

  4. whisper.cpp - CPU-friendly Whisper inference with CLI and build steps

  5. Piper TTS - Fast, local neural TTS with multiple voices/languages
